

THIS WEEK 


ZIKA Virus is just one of 
multiple birth defect 
threats p.8 


EDITORIALS 


WORLD VIEW Bioweapons treaty must update its science p.9

OZONE Earth’s radiation shield on the mend again p.12


Doctor, doctor 


Writing a PhD thesis is a personal and professional milestone for many researchers. But the process needs to change with the times.


According to one of those often-quoted statistics that should be
true but probably isn't, the average number of people who read
a PhD thesis all the way through is 1.6. And that includes the
author. More interesting might be the average number of PhD theses 
that the typical scientist — and reader of Nature — has read from start 
to finish. Would it reach even that (probably apocryphal) benchmark? 
What we know for sure is that the reading material keeps on coming, 
with tens of thousands of new theses typed up each year. 

To what end? Reading back over a thesis can be like opening up a 
teenage diary: a painful reminder of a younger, more naive self. The 
prose is often rough and rambling, the analyses spotted with errors, 
the methods soundly eclipsed by modern ones. And students in the 
process of writing a thesis can find themselves in a very dark place 
indeed: lost in information, overwhelmed by literature, stuck for the 
next sentence, seduced by procrastination and wondering why on 
earth they signed up to this torture at all. 

Two News Features this week reflect on that question. They 
examine the past, present and future of the PhD thesis and the oral 
examination that often accompanies it. On page 22, three leading 
scientists — including Francis Collins, director of the US National 
Institutes of Health — dig out and reread their theses for us, and talk 
about what they learned. Their musings (filmed and available in a series 
of videos at go.nature.com/297qrah) show, reassuringly, that they are 
just the same as the rest of us. They made mistakes, had moments of 
self-doubt and considered quitting. (Collins actually did quit.) But their 
stories also reveal how it is important to have the long view in mind. 

Thumbing through their theses now, they see how much they 
learned about the scientific process and how to conduct rigorous 
research. They realize how precious it was to be able to devote them- 
selves to a single piece of original and creative work. And they feel a 
sense of accomplishment and pride — as everyone tends to after any 
difficult life challenge that they struggle with and eventually conquer. 

Completing a thesis represents a coming of age not just scientifically, 
but also educationally and personally. It signals the passing of an intel- 
lectual milestone — from a student under the care of a supervisor to an 
individual who asks questions of their own. It marks the end of formal 
education, and graduation to a new phase in life. For many people, it 
also sees their departure from science altogether. Often, the PhD years 
coincide with significant personal events, as we mature emotionally 
and meet friends, partners and colleagues who will stay with us for life. 
All this can also turn thesis-writing into a more significant event than 
merely the writing up of a (usually) minor piece of science. 

Still, it’s perhaps too easy to get sentimental over the thesis. For a 
start, the process has to keep up with the times. The PhD is already 
assessed in many different ways around the world (as the second 
News Feature, on page 26, describes) and scientists should welcome
ways to keep it relevant. The goal of PhD assessment everywhere
remains, rightly, to demonstrate that a student has conducted,
and can communicate, independent, original research. But the way in
which that’s achieved can and should be improved. 

For one thing, it doesn't have to involve a vast printed volume. A
lot of students could do themselves, their supervisors, their examiners
and their wider audience a favour by keeping it crisp and short.
Postgraduate supervisors should stress this at the beginning. And it’s
important to make the work in the thesis available to future
researchers by publishing or sharing the data in some form. To
contribute to the world beyond the author’s immediate circle, a PhD
thesis should be read and used, and not just serve as a shelf ornament
or doorstop.

For those inspired to go back to their own 
thesis, and those who are examining a freshly written one, it’s best to be 
kind. As long as the fundamentals are there — the question is interest- 
ing and the approach and analysis rigorous — it’s fair to forgive the 
typos and the research paths that turned out to be dead ends. A PhD 
is, after all, training in research, and to try — and fail — is a valuable 
part of that course. 

Do you know where your PhD thesis is? Dig it out and share with 
@NatureNews on Twitter using the hashtag #the3wordthesis. You 
might even bump up that average readership.




False assumptions 


US regulators must regain the upper hand in 
the approval system for stem-cell treatments. 


You may have heard that regulators in the United States are too
strict when it comes to stem-cell treatments. If not, then you
will probably hear that message soon: patient groups, entrepreneurs
and politicians are broadcasting it as they lobby for a change
in the law. The Food and Drug Administration (FDA), this narrative 
asserts, is holding back effective therapies and, in the words of the most 
extreme, killing people by blocking their access to cures. 

This is false. The claim that regulation is too harsh wrongly implies 
that the FDA is holding back therapies that work. Critics point to dec- 
ades of preclinical and clinical work with stem cells and the pipelines 
of stem-cell treatments. With circular logic, they argue that, because 
the treatments have not been approved, there is something wrong with 
the approval system. 

The assumption in these accusations — that these treatments 
work — is at the heart of the problem. The FDA is right to insist that 




7 JULY 2016 | VOL 535 | NATURE |7 


© 2016 Macmillan Publishers Limited. All rights reserved. 




only proper clinical trials can make that case. And the agency’s critics 
are right to point out that this process is lengthy and expensive — 
perhaps too much so. 

The proposed change in the law — the REGROW Act — would 
tackle this problem by simply doing away with the need for proper 
trials. It would effectively borrow a fast-track system that Japan 
created for stem-cell treatments and regenerative medicine. Nature 
has previously expressed concern about this system (see Nature 528, 
163-164; 2015). It is not a fit and proper model to export, chiefly 
because it grants “conditional approval” to treatments with minimal 
safety data and little attention to efficacy. 

Therapies approved under this scheme can be marketed for a 
given period — around six years — after which time the treatment 
provider must report back on whether the treatment it has been selling 
to patients was safe and effective. 

In other words, patients (who in Japan have to pay up to 30% of the 
cost even of treatments covered by national insurance) are subsidiz- 
ing clinical trials. Most of these treatments, as the history of phase III 
trials shows, will probably fail. People who took an ineffective drug 
(and probably spurned other treatments to do so) will not get their 
money back. 

Japan still has to prove that data collection under this system will 
be rigorous enough to prove a treatment's efficacy. And if the system 
works and drugs are found to be ineffective, the regulatory agency will 
then have to fight the uphill battle of reining back treatments that were 
already on the market but are now de-approved. 

Overall, Japan will most probably see a flood of safe but ineffective 
treatments. That scenario would discourage anyone from going 
through the costly steps required to create therapies that really do 
work (if you can sell garbage for the same price, why not stick with 
that?). That would be a shame for a field with such promise. Is this
the way the United States wants to go?

Another reason for saying that the FDA is not unduly harsh on 
restricting stem cells is the large number of clinics that already operate 
and sell unapproved treatments. A study released last week reported 
351 businesses offering stem-cell treatments at 570 clinics in the United 
States (L. Turner and P. Knoepfler Cell Stem Cell http://doi.org/bkpv; 
2016). Although the study does not directly accuse these clinics or 
businesses of wrongdoing, many of them promise stem-cell treatments
for neurodegenerative diseases for which no stem-cell treatment has so
far proved effective.

These treatments, which usually claim that a certain type of stem cell
can transform into another type of mature cell able to ameliorate
such diseases, require approval by the FDA.
The existence of these clinics shows that the FDA is not strict — never 
mind too strict — in its regulation. 

That the FDA moves so slowly to crack down on existing 
unapproved stem-cell treatments makes the prospect of conditional 
approval — an opportunity to embed ineffective treatments in the 
US health-care system — all the more worrisome. 

The best way for the FDA to respond to the mood that has seeded 
the REGROW Act is to agree on a more efficient way to approve cell 
treatments. It is working to do so, but tensions are high. A hearing 
planned for April was overwhelmed by prospective participants. It 
is now scheduled for September — stretched to two days and with a 
public workshop added. 

The FDA should strive to keep this debate on the proper 
topic — how to create a more efficient system that still scientifically 
evaluates whether treatments are safe and efficacious. To fall short 
would be a setback for science, and for patients.




Beyond Zika 


The spotlight on Zika virus should help to spur 
broader research into birth defects. 


In the time it takes you to read this article, a baby will be delivered
in the United States with a birth defect. That’s about 120,000 every
year. For the many individuals with severe cases, childhood and
beyond becomes a struggle with mental or physical disabilities, hos- 
pital visits and day-to-day worries. And that is in one of the world’s 
richest countries. In low- or middle-income countries, surveillance of 
birth defects is often absent or so weak that health authorities simply 
don't know the scale of the problem, making it difficult to develop 
appropriate prevention measures and care. 

The harsh realities of birth defects are shown in recent photographs 
of babies born in Brazil with abnormally small heads — a condition 
called microcephaly that seems to be linked to the mosquito-borne 
disease Zika. The threat of the Zika virus has put birth defects on the 
political and public-health agenda in a way not seen since the rubella 
virus (the cause of German measles) led to a pandemic of such defects 
in the mid-1960s. 

Zika therefore provides an opportunity to greatly raise awareness of 
birth defects, and to bolster support for research and improved public- 
health action on their many other preventable causes. Researchers 
must urgently make this case to funders and their political paymasters 
before the flurry over Zika inevitably ebbs (see page 17). 

One target should be the eradication of rubella. It is a scandal that, 
worldwide, some 100,000 babies are born annually with congenital 
rubella, despite the availability of a cheap and effective vaccine. The 
virus spreads slowly and is a low-hanging fruit for eradication through
accelerating vaccination in poorer countries.

Another easy target is the compulsory addition of folate vitamins 
to food staples to protect against neural-tube defects, such as spina
bifida, in developing fetuses. Despite a wealth of evidence that
compulsory fortification works, as well as its adoption in the United States,
most countries (including all European ones) have yet to follow suit. 

The longer-term challenge is to develop the research infrastructure 
needed to find and prevent the causes of birth defects — in particular 
because a whopping three-quarters of occurrences have no identi- 
fied cause. Some will prove to be random events, and others will have 
genetic or multifactorial origins, but it is likely that many are down to 
environmental or infectious exposures that public-health authorities 
can do something about. 

This sort of research requires long-term commitment and invest- 
ment, and the nurturing of highly specialized research communities. 
Of all the types of epidemiological research, studies of birth defects 
are perhaps the most difficult. Although their combined human and 
public-health impact is enormous, individual congenital abnormali- 
ties are relatively rare in comparison with, say, lung disease. This 
means that population-scale databases are needed to capture and 
record birth defects, and to achieve adequate statistical power. 

Amid the political climate of Brexit, there is a certain irony that 
one of the most developed surveillance systems for birth defects, the 
European Surveillance of Congenital Anomalies (EUROCAT), was 
conceived with far-sighted vision in 1974 by the then European Eco- 
nomic Community in the wake of the tragedies of rubella and the 
drug thalidomide. Such registries may seem mundane, but they are 
crucial if we are to underpin exploration of the causes and risk fac- 
tors of congenital anomalies and to provide an early-warning system 
for new causes of birth defects. 

Birth defects should be a top public-health priority to protect the 
youngest and most vulnerable members of our society. It is staggering 
in 2016 that they are not.




WORLD VIEW


Find the time to discuss new bioweapons


The Biological Weapons Convention needs to take the assessment of emerging
scientific dangers more seriously, argues Malcolm Dando.


In Geneva next month, officials will discuss updates to the global
treaty that outlaws the use of biological weapons. The 1972 Biological
Weapons Convention (BWC) was the first agreement to ban
an entire class of weapons, and it remains a crucial instrument to stop
scientific research on viruses, bacteria and toxins from being diverted
into military programmes.

The BWC is the best route to ensure that nations take the
biological-weapons threat seriously. Most countries have struggled to
develop and introduce strong and effective national programmes: witness
the difficulty the United States had in agreeing what oversight system
should be applied to gain-of-function experiments that created
more-dangerous lab-grown versions of common pathogens.

As scientific work advances (the CRISPR gene-editing system has been
flagged as the latest example of possible dual-use technology), this
treaty needs to be regularly updated. This is especially important
because it has no formal verification system. Proposals for
declarations, monitoring visits and inspections were vetoed by the
United States in 2001, on the grounds that such verification threatened
national security and confidential business information.

The treaty therefore relies on countries converting its prohibitions
into national law, and setting up proper regulations and oversight. But
there is a problem with the way that the BWC is set up to receive and
process scientific advice, which affects the ability to update it
efficiently. Next month’s meeting must address this problem, and
scientists who care about the societal impacts of research should lobby
their elected politicians to make sure that it does.

The BWC is formally reviewed every five years at a special conference
(the next is in November this year, and the August meeting in
Switzerland is to prepare for it). During the intervening years, annual
one-week meetings of government experts, and later of government
representatives (state parties), are intended to track progress and
raise issues. But there is not enough time at these meetings to discuss
what is needed in sufficient depth. So no properly thought-out
recommendations can be made.

In 2013, for instance, the experts’ meeting scheduled a mere six
hours of discussions on science and technology, less than a day. That
is not enough time for complex science to be presented, digested and
discussed, and not enough to consider its implications and suggest
revisions to the BWC.

There is widespread awareness that the current system is not fit
for purpose. At a preparatory meeting in April, 5 of the 13 working
papers dealt with the need to find a better way to carry out these
crucial interim discussions on science and technology. As the Russian
paper noted: “There is widespread agreement that improved
and more effective arrangements are required.”

Other international agreements have effective ways to track and deal 
with scientific and technological change. The 1997 Chemical Weap- 
ons Convention has a permanent scientific advisory board. When 
concerns were raised in 2011 about the possible harmful implica- 
tions of the convergence of chemistry and biology, that board set up 
a dedicated working group to investigate and report back. It did so 
in 2014 — concluding that the current threat was low but that future 
developments should be monitored closely. The assessment system led 
to action. At present, the BWC assessment system cannot. 

In the long term, the BWC may need a similar advisory board for
science. But that is unlikely to happen soon, and as science is rapidly
changing, we have to find a way to improve the way the
interim annual meetings work. My colleague 
Kathryn Nixdorff and I interviewed delegates at 
previous meetings about possible improvements, 
and we have some simple suggestions. 

The discussions of science at the experts’
meetings should be split off into a separate dedicated
parallel track. This is the best way to create
the necessary time. Even then, it will be impractical
to cover all relevant ground across the
sciences, so each year a specific topic — CRISPR 
editing, say — should be considered. Researchers 
and scientific bodies should present the 
facts, and then discuss the implications with 
government officials at the experts’ meeting. 
Who should attend these sessions? We argue that 
they should be open to representatives from any 
member state. 

To feed back the results of these expert discussions to the broader
BWC, a designated diplomat (in place for the full five-year period
between review conferences) would attend the
annual experts’ meetings and write a report. The annual meetings of state
parties should then assess these reports and agree any action needed. 
Future review conferences should check on progress. 

Even so, issues such as the possible dual-use threat from gene- 
editing systems will not be easily resolved. But we have to try. Without 
the involvement of the BWC, codes of conduct and oversight systems set 
up at national level are unlikely to be effective. The stakes are high, and 
after years of fumbling, we need strong international action to monitor 
and assess the threats from the new age of biological techniques. 

If the BWC cannot find a way to adapt to the pace of scientific and 
technological change, then it risks becoming irrelevant as the world 
searches for biosecurity in the coming decades.


Malcolm Dando is professor of international security at the 
University of Bradford, UK. 
e-mail: m.r.dando@bradford.ac.uk 




RESEARCH HIGHLIGHTS
Selections from the scientific literature


Reward system 
boosts immunity 


Activating the reward system 
in the brains of mice directly 
boosts their immune systems, 
offering a physiological 
explanation for the placebo 
effect. 

Shai Shen-Orr, Asya Rolls 
and their colleagues at the 
Technion-Israel Institute of 
Technology in Haifa activated 
neurons in a part of the
mouse brain that processes 
rewarding activities such as 
eating and sex. The next day, 
they injected the mice with 
the bacterium Escherichia 
coli. The animals showed 
increases in both short-term 
and long-term immune 
responses to the pathogen, 
compared with mice in a
control group. But these effects 
were lost when the researchers 
also inactivated the animals’ 
sympathetic nervous systems, 
suggesting that this system 
helps to mediate interactions 
between the brain and the 
immune system. 

Nature Med. http://dx.doi.org/10.1038/nm.4133 (2016)


ASTROPHYSICS


No neutrinos from
black hole smash


The first hunt for neutrinos coming from the merger of two black
holes, which last year produced the first direct detection of
gravitational waves, has come up empty.

Imre Bartos at Columbia University in New York and his colleagues
analysed data from two neutrino detectors: ANTARES, under the
Mediterranean Sea, and IceCube at the South Pole. They found that no
neutrinos were detected at ANTARES in the 500 seconds before or after
the black holes collided, and that just three were detected at
IceCube, none of which came from the direction of the event.

The scarcity of neutrinos from the collision puts an upper limit on
how much energy it could have radiated through the near-massless
particles, say the authors. If researchers can find a signal from a
black-hole collision in the future, they could use the relatively high
spatial resolution of neutrino telescopes to pinpoint its location.

Phys. Rev. D 93, 122010 (2016)


ZOOLOGY


Wind powers weeks of non-stop flight


Frigate birds use the power of the wind and rising
air to stay airborne for many weeks at a time.

Henri Weimerskirch at the CNRS Centre
for Biological Studies in Chizé, France, and his
colleagues fitted great frigate birds (Fregata
minor) with devices to track their movements
over the Indian Ocean. Some birds were also
fitted with devices to measure their heart rate
and acceleration.

The researchers found that the birds stayed
on the wing for up to 48 days and travelled an
average of 450 kilometres daily, often tracking
the wind around the edge of the huge area of
low pressure called the doldrums.

The birds use a “roller-coaster flight”, the
authors say, ascending up to 4,000 metres with
the help of the wind and thermals. Frigates
cannot land on the water, but they can glide
over distances of many kilometres in a
low-energy mode, sometimes with no flapping at
all. This may provide them with the opportunity
to nap for up to 12 minutes at a time, and allow
them to stay in the air almost indefinitely.

Science 353, 74-78 (2016)


EVOLUTION


Lizards tailor tails
to local predators


Brightly coloured tails are a common feature of young lizards, and can
be tailored to the eyesight of specific local predators.

Takeo Kuriyama and his colleagues at Toho University in Funabashi,
Japan, collected 15 juvenile Plestiodon latiscutatus lizards from three
areas of Japan dominated by different predators: snakes, weasels or
birds. Lizards’ tails were vivid blue where weasels or snakes were
common, but had high ultraviolet reflectance only in areas high in
snakes. Weasels can see blue wavelengths, but, unlike snakes, cannot
detect UV light, suggesting that the lizards have evolved to draw the
attention of specific local predator species away from their bodies
and towards their disposable tails. Brown tails were found in the area
where keen-eyed predatory birds make camouflage a better strategy.

J. Zool. http://doi.org/bkqm (2016)




Martian moons 
formed in situ 


The moons of Mars may have 
formed from a disk of debris 
kicked up by the impact of a
giant meteorite on the planet. 
Astronomers have struggled 
to explain the existence 
of Phobos (pictured) and 
Deimos, the small, irregularly 
shaped moons of the red 
planet. One view is that they 
were asteroids captured 
by Mars. But a team led by 
Pascal Rosenblatt at the Royal 
Observatory of Belgium in 
Brussels tested an alternative 
idea using computer 
simulations of how orbiting 
debris, created by a giant 
impact, might coalesce. 
Relatively large moons form 
in the inner part of the disk 
thrown up by such a smash, 
and migrate outward, causing 
the outer part of the disk to 
coalesce into two bodies the 
sizes of Phobos and Deimos. 
The inner large moons are 
eventually dragged inward and 
fall back to Mars over 5 million 
years. 
Nature Geosci. http://dx.doi.org/10.1038/ngeo2742 (2016)


Warming shifts 
plant sex ratio 


Climate change seems to be 
skewing the sex ratios of an 
alpine herb towards male 
plants. 

William Petry at 
the University of 
California, Irvine, 
and his colleagues 
analysed data on 


populations of the herb 
valerian (Valeriana edulis) 

in the Rocky Mountains of 
Colorado as the region became 
warmer and drier over the 
past few decades. They found 
that in 2011, plants at the 
highest elevations were only 
23% male, whereas at lower 
altitudes, where the climate is 
warmer and wetter, the plants 
were up to 50% male. Across 
9 populations at a variety 

of elevations, there was an 
average of 6% more males in 
2011 than in 1978. 

A higher male-to-female 
ratio could result in increased 
pollination — and therefore 
seed production — which 
could help the plants to 
expand their range as they 
adapt to climate change, the 
authors suggest. 

Science 353, 69-71 (2016) 


Soft wheels make 
robots tough 


Wheels built entirely from 
soft materials can help robots 
to roll over tricky terrain and 
resist damage. 

Aaron Mazzeo and his 
co-workers at Rutgers 
University in Piscataway, 
New Jersey, built a squishy 
wheel inspired by the inching 
motions of soft creatures such 
as earthworms. A stretchable 
ring contains multiple 
internal chambers, groups 
of which can be inflated and 
deflated sequentially around 
the circle. The pressurized 
compartments exert torque on 
a second, outer ring, causing it 
to turn. 

A soft robotic vehicle fitted with four of these wheels (pictured)
travelled on a flat surface at 3.7 centimetres per second and kept
moving after being dropped from eight times its height. The
researchers also drove the robot over a rocky landscape and
underwater, and show that their concept can be modified to make winch
rotors.

Adv. Mater. http://doi.org/f3qjsh (2016)


Leukaemia cells
hide in fat tissue


Cancer-causing stem cells evade chemotherapy by surviving in fat
deposits around gonads.

Fat tissue supports the growth of normal blood-forming stem cells.
Craig Jordan of the University of Colorado Denver and his colleagues
found that in a mouse model of one form of leukaemia, gonadal fat
deposits contained numerous cancer stem cells, but subcutaneous fat
had very few. Leukaemic cells induced the breakdown of gonadal fat,
releasing nutrients that fuelled the growth of malignant cells in fat
as well as other tissues. The cancer stem cells also expressed CD36, a
cellular marker that boosts fat metabolism, helping to protect the
cells from many chemotherapy drugs.

Targeting fat metabolism could help to eradicate leukaemia stem cells,
the authors suggest.

Cell Stem Cell http://doi.org/bkqj (2016)


SOCIAL SELECTION


Fake article webpages draw fire


A debate is swirling around a tactic that academic publisher
John Wiley & Sons uses to fight online piracy (see
go.nature.com/299xily). The company created a webpage, accessible
by several URLs, that looked like an academic paper to
automated downloading programs. But any users who
accessed the URLs were then blocked from viewing other Wiley
content. Wiley and other publishers use these ‘trap’ URLs (which
are invisible to most human users) to detect and prevent unauthorized
downloading and republishing of their content. But some users say
that the tactic is too heavy-handed.

NATURE.COM
For more on popular papers:
go.nature.com/29hhqog


Negative carbon 
emissions needed 


Countries’ existing promises 
regarding emissions 
reductions are unlikely to 
prevent global warming 
exceeding 2 °C above pre- 
industrial temperatures by the 
end of the century, meaning 
that large amounts of carbon 
may need to be removed from 
the atmosphere. 

Benjamin Sanderson 
and his co-workers at the 
US National Center for 
Atmospheric Research in 
Boulder, Colorado, explored 
the odds of staying below 
2 °C of warming for a range of
emissions pathways. They also 
analysed whether ‘negative 
emissions’ — the removal of 
carbon from the atmosphere 
— will be necessary. 

To avoid crossing the 
2-degree threshold during this 
century, net global emissions 
must drop to zero by 2085, 
the authors find. Depending 
on the level of near-term 
reductions, between 1.5 billion 
and 5 billion tonnes of CO2 per
year will need to be captured 
and removed from the 
atmosphere thereafter. 
Geophys. Res. Lett. http://doi.org/bkqh (2016)


> NATURE.COM 

For the latest research published by 
Nature visit: 
www.nature.com/latestresearch 




SEVEN DAYS


BUSINESS
Volkswagen to pay 


Volkswagen has made a 
US$14.7-billion settlement 
with regulators in the United 
States over its manipulation of 
emissions tests for the diesel 
engines used in almost half 

a million cars on US roads. 
The world’s biggest carmaker 
announced on 28 June that 

it would set aside $10 billion 
to buy back or terminate the 
leases on affected Volkswagen 
and Audi models made from 
2009 to 2015, or to modify 
the vehicles to reduce their 
emissions. It agreed to spend 
$2.7 billion on cleaning up 
environmental pollution and 
to invest an extra $2 billion 
in clean car technology in the 
United States over the next 
10 years. 


EVENTS 


Laureates on GM 
More than 100 Nobel laureates 
have signed an open letter 
urging environmental 

group Greenpeace to 

stop campaigning against 
genetically modified (GM) 
organisms. “Opposition based on emotion and dogma
contradicted by data must be stopped,” they say. The 29 June
letter cites, in particular, campaigns against the
vitamin-A-enriched ‘golden rice’, which the laureates say
could reduce disease-causing deficiencies of the nutrient.
Greenpeace responded on 30 June, saying that golden rice
had failed as a solution and suggesting that malnutrition
should be addressed through “diverse diet, equitable access
to food and eco-agriculture”.


Brexit fallout 
Scientists in the United 
Kingdom have continued 
to voice concerns about 
the impact of the country’s 
23 June vote to leave the 


[Figure: total ozone (Dobson units, scale 0–600) over Antarctica, September 2000 and September 2014.]


Ozone hole shows signs of healing 


The Antarctic ozone hole is on the mend, 
according to an analysis published on 30 June 
(S. Solomon et al. Science http://doi.org/bkn5;
2016). The hole, which opens in the
stratosphere every Antarctic spring, has
shrunk in its September extent (pictured) by an 
average of 4.5 million square kilometres since 
2000. The work used balloon observations
and climate models to confirm that ozone
is recovering thanks to the 1987 Montreal


European Union, amid fears 
over access to EU funding
and the employment rights
of academics. High-profile
figures including Paul Nurse, 
director of the Francis Crick 
Institute in London, have 
demanded that science be 
given a seat at the table during 
exit negotiations. An inquiry 
into the implications of the vote 
has been started by politicians 
from the House of Commons 
Science and Technology 
Committee. And science 
minister Jo Johnson attempted 
to reassure researchers in a
speech in London on 30 June 
about ongoing plans to reform 
the country’s research-funding 
system. He noted that the 
United Kingdom is still open 




to all EU researchers. See 
go.nature.com/29v5m8n for 
more. 


Chemist killed 


An ‘anarcho-primitivist’ group
has claimed responsibility
for the killing of José Jaime
Barrera Moreno, a chemist
at the National Autonomous
University of Mexico (UNAM)
who was stabbed to death
at the university in Mexico
City on 27 June. In a message 
posted online on 29 June, a 
group calling itself Individuals 
Tending Towards Savagery 
(ITS) says that it carried out 
the attack. ITS is an alliance
of eco-extremist groups that
also claimed the 2011 shooting 
of another UNAM scientist, 




Protocol, which banned ozone-depleting 
chlorofluorocarbon chemicals. Fingerprints 
of healing are most obvious in September, the 
month the hole begins to grow in earnest. An 
exception to the healing trend occurred in 
2015, when a record hole opened because of 
a volcanic eruption in Chile. Sulfur particles 
from the event temporarily accelerated the 
ozone-destroying reactions. See
go.nature.com/29ewicm for more.


biotechnologist Ernesto 
Méndez Salinas. The group has 
repeatedly attacked scientists 
and technologists, whom it 
blames for destroying nature. 


Dawn stays put 

On 1 July, NASA turned down 
a request to send its Dawn 
spacecraft, currently orbiting 
the asteroid Ceres, to another 
destination. Team leaders had 
wanted to send the mission, 
which has been at Ceres since 
March 2015, to the asteroid 
Adeona. NASA instead opted 
to keep Dawn where it is, to 
study Ceres as it gets closer to 
the Sun. The agency approved 
mission extensions for all of the 




planetary spacecraft that it was 
considering them for, including 
New Horizons, which after 

a successful visit to Pluto last 
year is now on its way to 2014 
MU69, an object in the Kuiper 
belt. It is expected to fly past the 
icy world in January 2019. 


Rosetta’s final move 


The Rosetta spacecraft will 
crash land on comet
67P/Churyumov-Gerasimenko
on 30 September, two years
after reaching it, the European 
Space Agency announced on 
30 June. In August, engineers 
will put Rosetta in a series
of flattening elliptical orbits,
ahead of a final manoeuvre
12 hours before impact. In its
final descent, Rosetta will get 
its closest look yet at the comet. 
Despite a relatively soft landing, 
at a speed of 50 centimetres per 
second, the impact will bring 
the mission to a close: Rosetta’s 
transmitter will be switched off 
and its antenna will no longer 
be able to point to Earth. 


Juno finds Jupiter 
NASA’s Juno spacecraft slipped
into orbit around Jupiter on
5 July, becoming the first craft
to visit the giant planet since
the Galileo mission arrived
in 1995. The probe (artist’s
impression, pictured) fired
its main engine flawlessly in
a 35-minute burn that sent
it looping around the planet
into a 53.5-day orbit; later
this period will be reduced to


TREND WATCH 


Hundreds of clinics in the United
States are offering unapproved
stem-cell treatments for a range
of medical conditions. A rigorous
search and analysis of Internet
advertising, published on 30 June,
found that 351 US businesses were
marketing such treatments at
570 clinics for conditions including
pain and autism (L. Turner and
P. Knoepfler Cell Stem Cell
http://doi.org/bkpv; 2016). Most clinics
offered treatments with stem cells 
derived from the patient; other 
sources for the cells included 
amniotic and placental material. 


14 days. The US$1.1-billion 
project aims to explore basic 
questions such as what Jupiter 
is made of and whether it has a 
core. See go.nature.com/29icaol 
for more. 


EU trawling ban 


A deal has been struck to 
limit deep-sea trawling in 
European Union waters, 
ending a long-running battle 
between non-governmental 
organizations, researchers, 
politicians and the fishing 
industry. On 30 June, 
ministers and Members of 
the European Parliament 
announced that all trawling 
below 800 metres will be 
banned, and fishing for deep- 
sea species above this depth 
will be permitted only in areas 
where it took place between 
2009 and 2011. Conservation 
groups such as BLOOM in 
Paris and some scientists have 
long argued that deep-sea 


trawling destroys sensitive 
marine ecosystems that may 
take years to recover, if they 
recover at all. 


Pesticide ruling 


The European Commission has 
extended authorization of the 
use of the weedkiller glyphosate 
in the European Union for
up to 18 months. The 29 June
decision, which is pending
completion of a risk assessment
by the European Chemicals
Agency, came a day before the
previous authorization was
due to expire. It followed the
failure of EU member states
to agree, with the necessary
qualified majority, on whether 
glyphosate should be approved 
for a further 15 years — or 
banned. Critics fear that the 
chemical causes cancer, but 
many experts say that it is safe. 


Clean-energy plan 
North America is to get half 
of its electricity supply from 
non-fossil fuels and renewable 


STEM-CELL THERAPIES BRANCH OUT

Unapproved stem-cell treatments are being marketed in the United
States for medical conditions as well as for cosmetic enhancement.

[Bar chart: number of businesses (axis 0–350) marketing treatments in each category: sexual, cardiovascular, skin, ageing, orthopaedic, pain, sports, neurological, immune, respiratory, urological and cosmetic. Advertised cosmetic procedures include stem-cell ‘facelift’ and ‘breast augmentation’.]




energy sources by 2025, the 
leaders of Canada, the United 
States and Mexico pledged at 
a meeting last week in Ottawa. 
In their 29 June statement 
on a North American
Climate, Clean Energy, and
Environment Partnership,
Canadian Prime Minister
Justin Trudeau, US President
Barack Obama and Mexican
President Enrique Peña Nieto
also agreed to reduce methane 
emissions, improve energy 
efficiency and advance carbon 
capture and storage technology 
in their countries. 


Animal use ends 


Medical schools in the United 
States and Canada have ceased 
using live animals to teach 
students. The University of 
Tennessee College of Medicine 
in Chattanooga was the last 
US medical school that still 
used animals for this purpose, 
and on 26 June, the university 
told the Physicians Committee 
for Responsible Medicine 
(PCRM) that it was ending
the practice. According to the
PCRM, medical schools believe 
that simulators provide better 
education as they are based on 
the human body. 


Harassment case 


Michael Katze, a virologist 
known for his research on 
Ebola at the University of 
Washington in Seattle, has 
been suspended from his lab 
following violations of sexual- 
harassment policy. In a 29 June 
statement, the university said 
that it is discussing disciplinary 
measures. The university's two 
investigations found that Katze 
routinely made inappropriate 
comments to female employees 
and had a quid-pro-quo sexual 
relationship with a lab worker. 
The university also determined 
that Katze misused public 
resources by instructing his 
employees to perform personal 
chores for him, including 
soliciting sexual partners for 
him online. 


> NATURE.COM 
For daily news updates see: 
www.nature.com/news 






NEWS IN FOCUS


Zika spotlights more common birth-defect virus p.17
Amazon rainforest primed for record fire season p.18
Small, cheap ‘CubeSats’ await inter-planetary ride p.19
Is it time to modernize the PhD thesis? p.26


POLITICS


John Holdren looks back 


Obama’s top scientist talks to Nature about shrinking federal budgets, Donald Trump, 
and his biggest regret after nearly eight years in the White House.


BY JEFF TOLLEFSON AND SARA REARDON 


John Holdren is no stranger to the spotlight.
Over his long career in science, Holdren,
a physicist by training, has worked
on controversial high-profile issues such as
climate change and nuclear non-proliferation.
But for nearly eight years, he has enjoyed an
even higher profile, as US President Barack
Obama’s science adviser, and director of the
White House Office of Science and Technology
Policy (OSTP).

With Obama due to leave the White
House in January 2017, Holdren, now the
longest-serving US science adviser, recently
sat down with Nature for a wide-ranging chat.
The interview has been edited for length and
clarity.


Opinion polls continue to show a divide 
between what the American public thinks 
about science and what scientists think. Has 
Obama done enough to change the way that 
science is perceived? 

The president has done an incredible job in 
making science cool for young people. This is 
already evident in all kinds of numbers: you see 
more kids enrolling in science courses, more 




kids participating in science fairs, more kids
going to ‘makerspaces’. We have substantially
increased the number of engineers graduating
from college in this country. I say ‘we’, but
obviously, that is a large cooperative operation that
includes colleges and universities.

I’m not sure which polls you are referring to,
but my impression is that the public is more 
interested in and enthusiastic about science, 
technology and innovation than it was at the 
beginning of this administration. 


Leaders at the National Institutes of Health 
(NIH) and other government agencies have 




discussed the widespread
perception that we are training 
too many PhDs. Do you worry 
about that? 

If every PhD we train believes
that her or his only acceptable
career trajectory is a tenured
professorship in a college or
university, then it’s true: we are training
more PhDs than there are slots
of that kind. But the PhD is, in
fact, a very versatile degree. Far
more than just demonstrating
that you know more than
practically anybody else about one
narrow topic, it demonstrates
that you have the fortitude, the
focus, the commitment and the
intellectual capacity to tackle a
very tough problem.

PhDs are finding constructive
and rewarding employment
all across the economy, and, overall, our view 
is that there are still more opportunities for 
highly trained people in science, technology 
and innovation than there are people being 
trained. 


Do you worry about future science funding? 
The president has consistently recommended 
more money for science and technology than 
Congress has been willing to pass. 

The success ratio of proposals to the NIH is 
something like 17% — that is, we are funding 
one-sixth of the proposals that the NIH gets. 
And those proposals are already self-selected. 
Investigators don't bother writing a proposal 
to the NIH unless they think they have got a 
really good idea, a capable team and a plausible 
strategy. If you ask Francis Collins, the NIH 
director, what fraction of the proposals they get 
that are worthy of funding, he'll tell you 50%. 

That means we are funding about a third of
the potentially productive, influential,
path-breaking research that is proposed to the NIH.
But the NIH has a budget of over US$30 billion
per year. It’s not very easy in these budget times
to increase a $30-billion budget by a large
factor, like 50%, never mind 100% or more,
as director Collins would say is warranted in 
terms of the quality of the research. The same is 
true at the National Science Foundation — far 
more worthy proposals than they are able to 
fund. This is a consistent problem. I would like 




Holdren and Obama have pushed for bigger science budgets — with mixed results. 


to see more public support for raising public 
spending on research and development. 
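
The funding arithmetic in this answer can be checked with a short calculation. The sketch below uses only the two figures quoted in the interview (a success rate of about 17% and a worthy share of 50%). The function names are hypothetical, and the budget multiplier assumes, purely for illustration, that the average cost per funded proposal stays constant.

```python
# Figures as quoted in the interview (approximate).
SUCCESS_RATE = 0.17   # share of all NIH proposals that get funded
WORTHY_SHARE = 0.50   # share of all proposals judged worthy of funding


def funded_fraction_of_worthy(success_rate: float, worthy_share: float) -> float:
    """Fraction of the worthy proposals that actually receive funding."""
    return success_rate / worthy_share


def budget_multiplier_to_fund_all_worthy(success_rate: float, worthy_share: float) -> float:
    """Illustrative assumption: cost per funded proposal is constant,
    so the budget scales with the number of proposals funded."""
    return worthy_share / success_rate


fraction = funded_fraction_of_worthy(SUCCESS_RATE, WORTHY_SHARE)
multiplier = budget_multiplier_to_fund_all_worthy(SUCCESS_RATE, WORTHY_SHARE)

print(f"Worthy proposals funded: {fraction:.0%}")
print(f"Budget multiplier needed to fund all worthy proposals: {multiplier:.1f}x")
```

Under that assumption, funding every worthy proposal would mean nearly tripling the budget, which is consistent with the remark that an increase of 100% or more would be warranted.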


Science is global today. How do you think
that complicates matters? Can the regulators
keep up?

I’m going to China this week for a strategic 
and economic dialogue and for a US-China 
dialogue on innovation policy. I'll be talking 
with my Chinese counterpart, Wan Gang, 
the minister of science and technology, about 
some of these very problems and what we are 
doing about them. 

We have a lot of cooperation with China 
on biomedical issues. We talk to them all of 
the time about gain-of-function research 
and about gene-editing issues. And in fact, 
when the current round of interest in gene 
editing emerged with the rise of the CRISPR 
technology, the [US] National Academies of 
Sciences, Engineering, and Medicine gathered 
leading scientists from
all over the world in a
format very much like
Asilomar [a landmark
conference in 1975 that
set rules for research on
recombinant DNA], but
strongly international.
The top Chinese people
came to talk through
what the implications
of these technologies are, and
how we should think as a global
science community about
regulating them.

NATURE.COM
For an extended version of this interview, see:
go.nature.com/29mxyuj

MORE ONLINE
NASA’s Juno probe enters Jupiter’s orbit go.nature.com/29|mrmb
Biomedical researchers lax about validating antibodies for experiments go.nature.com/29Imswx
Antarctic ozone hole is on the mend go.nature.com/29ewicl
Background does not impact UK scientists’ pay go.nature.com/29jpnji


Shortly after he took office, 
Obama said that this was going 
to be the most transparent 
administration ever. But 
journalists have found some 
agencies to be fairly opaque. 

In the first months of the
administration, the president
issued executive orders on
transparency, on scientific integrity,
on openness in government. I
was put in charge of a number
of the implementation [efforts].
That has been a focus of OSTP 
throughout this administration. 
We've gotten virtually all of the 
departments and agencies to 
produce for public review and comment, and 
then to finalize, policies on openness and on 
scientific integrity. I think we've made great 
progress in terms of open data, in terms of the 
publication in open venues of federally funded 
research. But I would not argue that that job 
is finished. 

There is always a tendency in government, 
some of it quite legitimate, not to expose
internal deliberations prematurely. You know, it’s
quite challenging to have a discussion between 
the president's senior advisers with reporters 
from Nature, Science and The New York Times 
sitting around in the room, because if you do 
that, nobody will float a trial balloon for fear 
that the trial balloon will get into the news as 
a done deal. 




You’ve spent almost eight years inside what
is arguably the most powerful institution
on Earth. Do you come away more or less
optimistic about humanity’s ability to deal
with its problems?

I come away more optimistic, and that’s in
large measure due to the extraordinary
leadership that President Obama has provided. I have
felt for many decades that science, technology
and innovation are crucial if human society is
to get its arms around the biggest challenges
we face. And I’ve had the pleasure of working
for a president for nearly eight years now who
shares that view.




NATURE PODCAST 


Landscapes and 
nature; the Hitomi 
observatory’s 
swan song; and 
reforming peer 
review nature.com/ 
nature/podcast 




This 22-year-old is severely disabled by fetal cytomegalovirus infection and cannot communicate verbally. 


PUBLIC HEALTH 


Zika raises wider 
birth-defect issue 


Cytomegalovirus is a greater global problem than Zika. 


BY DECLAN BUTLER 


A virus is killing hundreds of babies
in the United States each year, and
leaving thousands with debilitating
birth defects, including abnormally small 
heads and brains. This is not the Zika virus. 
It is a common and much less exotic one: 
cytomegalovirus (CMV). 

Now, as the eyes of the media and health
officials focus on the spread of Zika in the Americas
and beyond, many researchers and advocates 
hope that funders and health agencies will at 
last pay more attention to a much greater global 
problem — the millions of babies born year in, 
year out, with often-serious birth defects. 

“Birth defects are not high on the
public-health agenda,” says Stanley Plotkin, a retired
scientist who in the 1970s developed the
current vaccine against rubella (German measles).
A 1960s rubella pandemic caused tens of
thousands of birth defects in the United States alone.

“Zika is an opportunity,” he says — to raise 
the profile of birth defects among research 
funders and public-health agencies, and to 
accelerate efforts to develop a CMV vaccine. 

The World Health Organization (WHO) 
estimates that, annually, more than a
quarter of a million babies worldwide die shortly
after birth from congenital anomalies, and 
many more are born with serious defects. The 


causes are many — some known, some not. A 
global focus on reducing child mortality has 
meant that severe disabilities in children are 
a lower public-health priority, says Anita Kar, 
a specialist in congenital abnormalities at the 
University of Pune in India. 

CMV is a poster child for the problem,
and with Zika so much in the news, scientists
and advocacy groups are voicing
frustration and trying to seize the moment. The US
National CMV Foundation is running
information campaigns comparing and contrasting
Zika with CMV. It is lobbying politicians to
build on the mandates enacted in several states
for public-health authorities to produce
outreach material, including billboards on sides
of buses, and to do CMV tests for all infants
with hearing difficulties.

“Zika has become a way to open up
conversations about CMV,” says Janelle Greenlee, a
co-founder of the CMV foundation, who lost
one daughter to congenital CMV and has
another, the daughter’s twin, with serious
hearing loss and cerebral palsy.

CMV infections in adults, children and
infants are mostly asymptomatic and
harmless, but the virus is much more dangerous,
and often lethal, to the fetus. Worldwide, around
1 in 100 to 500 babies are born with congenital
CMV, and of the 10–20% who show symptoms,
about 30% will die. Survivors often have liver,




lung or spleen damage, or neurological
problems including developmental disability or loss
of hearing or sight.

CMV’s link to birth defects has been known
since the 1950s, yet a 2012 survey found that
only 13% of US women and 7% of men had
heard of congenital CMV (M. J. Cannon et al.
Prev. Med. 54, 351–357; 2012). Low
awareness is deadly, says Gail Harrison, an
infectious-disease researcher in CMV at the Baylor
College of Medicine and the Texas Children’s
Hospital, both in Houston.

There is no vaccine, so precautions
(hand-washing and avoiding contact with children’s
saliva and urine) are the only defence.
Harrison works closely with patient groups to
promote awareness, but says that she struggles
with the inertia of state and federal agencies in
helping to get these messages across.

The administration of US President Barack 
Obama has requested more than US$1 billion 
for research and control measures for Zika, and 
the website of the US Centers for Disease
Control and Prevention (CDC) is awash with
information and advice on that virus, she notes. But
the more modest amount of information on
CMV has to be actively searched for. 

Leading health experts and the CDC expect 
that Zika in the United States will be limited to 
small, localized outbreaks in southern states 
where Aedes aegypti, the mosquito that
transmits the virus, is present during warm parts
of the year. That prediction is based on the
pattern of past US outbreaks of dengue and
chikungunya, two other diseases carried by
the same mosquito. For the United States, says
Plotkin, “there is little doubt that CMV is a
bigger problem than Zika”.

Contributors to birth defects include genetic 
abnormalities as well as many more preventable 
factors, such as infectious diseases, medications, 
diet and environmental chemicals. But the 
causes of almost three-quarters are unknown. 

Better training in birth-defects epidemiology 
is urgently needed, in particular in developing 
countries, says Kar. Such research is difficult,
requiring population-scale surveillance
registries, and often relies on questionnaires that ask
mothers of children with congenital
abnormalities to try to recall past exposures, a process
susceptible to inaccuracies.

To improve matters, pan-European and US 
birth-defect registries are increasingly trying to 
match pregnancy outcomes with vast databases 
of histories of prescribed drugs, local water- 
and air-pollution levels, and other factors. 
Prescription histories are especially important 
because pregnant women are usually excluded 
from clinical trials, and so little may be known 
about the safety of medicines for fetuses. 

But many poorer countries lack even basic
surveillance. In the case of neural-tube defects
such as spina bifida, for example, a global review
published in April found that 120 of the WHO’s
194 member states had no prevalence data.
“Registries are urgently required,” says Kar.




POLICY

Thumbs down 
for ‘Common 
Rule’ revisions 


Panel nixes US government 
changes to research ethics. 


BY SARA REARDON 


The US government’s proposed overhaul
of regulations that govern research with
human participants is flawed and should
be withdrawn, according to a review by the
US National Academies of Sciences,
Engineering, and Medicine.

The regulations, known collectively as the
Common Rule, address ethical issues such
as informed consent and storage of study
participants’ biological specimens. In their 29 June
report, the academies said that the government’s
suggested changes are “marred by omissions 
and a lack of clarity” and would slow research 
while doing little to improve protections for 
patients (see go.nature.com/29afkwd). Instead, 
the panel recommends that an independent 
commission craft new rules for such research. 

“This is a total smack-down,” says Ellen
Wright Clayton, a bioethicist and lawyer at 
Vanderbilt University in Nashville, Tennessee, 
of the academies’ report. 

The Common Rule, which was introduced in 
1991, seeks to ensure that research with humans 
is ethical by minimizing patient harm and 
maximizing the benefit to society. Over time, 
achieving these goals has become more complex 
because of technological advances such as the 
rise of DNA identification, which can make it 
harder to maintain patient privacy. 

The reforms, proposed in September by the 
US Department of Health and Human Services 
(HHS), attempt to address such emerging
concerns. For instance, the HHS proposal would
require participants’ consent to use stored 
samples, such as blood or tissue, for future 
research. Even if samples are anonymized, 
the HHS says, it is fairly simple to re-identify 
people on the basis of their DNA. 

But the US academies’ panel says that the 
proposed consent requirements would slow 
research unnecessarily, because little harm is 
likely to come to a person as a result of the use
of stored biospecimens. And if the specimens 
are de-identified, the extra consent forms 
themselves would further link the specimens 
to the person’s name and therefore increase the 
risk that the person would be identified. 

An HHS spokesperson says that the
government is still mulling the new report and more
than 2,000 public comments on its reforms.




Most fires in the Amazon are started by landowners trying to clear fields and forests for cultivation. 


Amazon set for 
record fire season 


Warm oceans presage intense blazes in rainforest. 


BY JEFF TOLLEFSON 


The Amazon is ready to burn. After
an unusually dry rainy season, the
southern section of the rainforest is
heading into winter with the largest moisture 
deficit since 1998. This has set the stage for 
an unusually intense fire season, according 
to a forecast issued on 29 June that is based 
on sea-surface temperature trends in the 
Atlantic and Pacific oceans. 

“The region is primed to have record fire
activity,” says forecast co-author Douglas
Morton, a remote-sensing expert at NASA’s
Goddard Space Flight Center in Greenbelt,
Maryland. More broadly, a team led by
Morton and James Randerson, a biologist at the
University of California, Irvine, says that
it can predict fire risk across much of the
globe, based in part on the influence of
the weather pattern El Niño and its
counterpart, La Niña.

The Amazon burn predictions stem from
the epic El Niño weather event that emerged
last year. El Niños warm the tropical Pacific
Ocean, which tends to reduce rainfall during
the rainy season, and the warmer
temperatures in the tropical Atlantic Ocean can
suppress rains during the dry season.

The El Niño that emerged last year also




helped to spawn devastating forest fires in 
Indonesia, the researchers say. Their work 
reveals that sea-surface temperatures in 
the Atlantic and Indian oceans foreshadow 
fire trends in Central America, Africa and 
some boreal forests in Earth's high northern 
latitudes. 

In each case, Morton and Randerson say,
ocean conditions can provide a hint of
precipitation trends in key forested areas on
land several months in advance. “All of these
processes are contributing to both the
build-up of fuels and the moisture level of those
fuels going into the dry season,” Randerson
says. “That’s what leads to a predictability in
global fire regimes.” 


FORECASTING VULNERABILITY 

Other teams are looking to include fire risk 
in short-term and seasonal weather forecasts 
by incorporating independent fire models. 
These models attempt to account for factors 
such as vegetation type and the likelihood 
of lightning strikes or agricultural fires. 
Eventually, such forecasting systems could 
integrate more complex phenomena such
as the dynamics of vegetation growth, the 
way that fire tends to propagate across a 
landscape and the gases and particles that 
are emitted during a fire, says Allan Spessa, a
fire modeller at the Open University in Milton
Keynes, UK.

The European Centre for Medium-Range
Weather Forecasts in Reading, UK, plans to
soon make public its prototype system to
forecast fire risk about six weeks in advance, and
the centre’s modellers are working to include
fire risk in their seasonal forecasts. Florian
Pappenberger, who heads the centre’s work on
extreme-weather forecasting, says that the
statistical approach used by Morton and
Randerson is solid and can serve as an independent
check on model forecasts, which come with
their own uncertainties. Forecasts for water
availability in rivers, reservoirs and
agricultural systems operate in such a manner today.

“I don’t think one method replaces the
other,” he says. “I expect that merging both
will be quite beneficial.”

However, whether forests actually go up in 
smoke depends on a host of factors, including 
law-enforcement and fire-suppression efforts 
that vary from region to region. For instance, 
almost all fires in the Amazon are started by 
landowners clearing fields and forests for 


cultivation and livestock. But once the
humidity drops and the vegetation dries out, those
agricultural fires can run wild. 


READY TO BURN 

The likelihood that this will happen increases
as the dry season wears on, but scientists can
already see El Niño’s impacts. Morton and
Randerson’s team analysed rainfall
measurements from gauges and satellites during
the rainy season, and used data from NASA’s
Gravity Recovery and Climate Experiment
(GRACE) satellites to provide an estimate
of the cumulative water storage on land (in
soils, aquifers and rivers) going into the dry
season. Randerson says that the situation in
the Amazon is worse than it was during the
major droughts of 2005 and 2010 and on par
with 1998, after the last major El Niño.


As well as forecasting risk in the Amazon, 




Morton and Randerson are tracking and
mapping fires there using infrared measurements
collected by the Moderate Resolution Imaging
Spectroradiometer (MODIS) sensors aboard
NASA’s Terra satellite. The device has detected
almost 12,500 fires in the Mato Grosso region
of Brazil this year alone, making 2016 the
third-worst year in the MODIS record, which
stretches back to 2003.

In the Amazon, the question now is whether 
Atlantic storm systems will bring much- 
needed relief during the dry season. Morton 
and Randerson have identified a link between 
Atlantic hurricanes and Amazon fires: when 
the tropical Atlantic is warm, cyclones are 
more likely to form, and those cyclones pull 
the rain bands that often flow into the
Amazon northwards. The US National Oceanic
and Atmospheric Administration's hurricane 
forecast currently calls for a neutral season, but 
the tropical Atlantic has been cooling, which 
bodes well for the Amazon. 

“If there were to be a shift in north Atlantic
sea-surface temperatures, that could
short circuit this fire forecast,” Morton says.


CubeSats queue up 
for deep-space rides 


Tiny craft face a wait to be propelled beyond Earth’s orbit. 


BY ELIZABETH GIBNEY 


CubeSats (spacecraft built from
10-centimetre-sided cubes, often with
off-the-shelf parts) are already
ubiquitous in near-Earth orbit, doing everything
from Earth observation to studies of bacterial
proteins in space. Now scientists are itching
to send them farther afield, and more than a
dozen deep-space CubeSats are in the pipeline.

The cost — typically no more than 
US$10 million for an interplanetary mission — 
means that the mini-craft can take risks that a 
more costly venture could not. They can also 
work in swarms, which allows new kinds of 
experiments. CubeSats generally piggyback on 
the launch of other missions, and whereas trips 
to low-Earth orbit, such as the cargo ships that 
shuttle to the International Space Station, are 
relatively common, missions to other parts of 
the Solar System are much rarer. 

Lifts are so hard to come by that the first 
interplanetary CubeSat — NASA's twin 
INSPIRE mini-spacecraft, intended to test key 
technology for future missions — has been 
waiting for almost two years. “We still have to 
find a ride,” says Anthony Freeman, who man-
ages the Innovation Foundry at NASA's Jet 
Propulsion Laboratory in Pasadena, California. 

CubeSats were originally conceived as a 
teaching tool in 1999. Today, they carry out both 
commercial missions and near-space science. 
But deep space poses a much bigger challenge 
(see ‘Miniature explorers’). Their diminutive 
size cannot accommodate standard propulsion 
and long-range communications equipment, let 
alone complex scientific instruments. 

Engineers are starting to overcome these 
problems, says Roger Walker, who oversees 
CubeSat development at the European Space 
Agency (ESA). To solve the communications 
problem, ESA's first interplanetary CubeSats will 
talk to Earth through a mothership. CubeSats 
will take part in the joint ESA-NASA Asteroid 
Impact and Deflection Assessment (AIDA) 
mission, planned for 2020, where they will take 
on risky jobs such as up-close data collection as 
a larger probe plunges into an asteroid. 

NASA's planned mission to Europa, currently 
under development, would also use the 
mother-daughter model, deploying a fleet of 
CubeSats to make close fly-bys of the Jovian 
moon. Scientists think that Europa could 
harbour life under its icy surface. 

Lone deep-space CubeSat missions are also 
on the horizon. NASA has developed a min- 
iature radio-communication system capable 
of talking directly to Earth from Mars and 
beyond. The agency will test the system on 
INSPIRE — which has a side-mission of map- 
ping interactions between Earth’s magnetic 
field and the solar wind — and on Mars Cube 
One (MarCO), twin communication satel- 
lites scheduled to fly on the InSight mission to 
Mars when it launches in 2018 after a two-year 
delay. NASA has also developed tiny, cold-gas 
firing thrusters for propulsion, and radiation- 
resistant electronics that can survive beyond 
the protection of Earth's magnetic field. 

Meanwhile, firms in Europe are developing 
high-efficiency ion engines, and a company in 
Rome called IMT is looking at ways to power 
such engines with deployable solar panels that 
can turn to constantly face the Sun. Together, all 
these technologies make solo CubeSat missions 
feasible, says Walker. 

Freeman predicts that more than a hundred 
CubeSats could be dispatched throughout 
the Solar System by the end of the next dec- 
ade — but only if they can get into space. He is 
calling on all space agencies to agree to carry 
at least one CubeSat on each major planetary 
mission. Walker agrees: “It would really stimu- 
late the area. Ultimately, that’s the main prob- 
lem to overcome for interplanetary CubeSats, 
alongside communications.” This would mean 
forging plans for a CubeSat tag-along early in 
the mission's design phase. 

To cope with the large number of CubeSat 
proposals, NASA also wants to see more 
low-cost commercial launchers developed, to 
carry tens to hundreds of kilograms of payload, 
in contrast to the 5 tonnes typical of launchers 
designed for communications satellites. Free-
man says that such smaller rockets could carry 
perhaps a few dozen 5-kg CubeSats to low-
Earth orbit, or be adapted to include an upper 
stage that could take a single CubeSat to deep 
space. He hopes to use a similar method to send 
a free-flying probe to Venus, where it would 
skim through the planet's acidic atmosphere. 


MINIATURE EXPLORERS 

Previously limited to Earth orbit by their diminutive size, shoe-box-sized CubeSat spacecraft are now poised 
to invade the rest of the Solar System, with missions planned to carry these craft as far as Jupiter. 
[Graphic: planned CubeSat missions shown are INSPIRE, Lunar Flashlight, a Venus mission, AIDA (to Didymos), 
NEA Scout and the Europa Multiple-Flyby Mission. Size and distance not to scale.] 


CubeSats aimed for the Moon might get 
an easier ride. NASA's Space Launch System, 
a heavy-lift rocket designed to send people 
beyond Earth’s orbit, will carry 13 CubeSats 
and an uncrewed Orion capsule on its maiden 
launch in 2018. The cargo will include Lunar 
Flashlight, which will use a reflected beam of 
light to look for icy deposits in the Moon's dark 
craters, and Near-Earth Asteroid (NEA) Scout, 
designed to explore a nearby asteroid. 

ESA is developing a separate lunar approach. 
Together with Surrey Satellite Technology Ltd 
(SSTL) in Guildford, UK, and the Goonhilly 
Earth Station in Helston, UK, it is developing 
a system that could solve two problems: a com- 
mercial mothership that would provide trans- 
port to the Moon and a data relay for dozens of 
CubeSats, for a fee of around £5 million (US$6.6 
million) per craft. Eventually, such a model could 
expand, says SSTL's Christopher Saunders. 
“Essentially, we want to build a Solar-System 
internet,” he told the Interplanetary CubeSat 
workshop in Oxford in late May. 

According to Freeman, CubeSats will soon 
be able to carry instruments that would have 
seemed off-limits only a few years ago, such as 
high-resolution imagers and radar altimeters. 
And a recent investigation by the US National 
Academies of Sciences, Engineering and Medi-
cine of CubeSats’ potential concluded that the 
probes are capable of doing “fantastic science”, 
Thomas Zurbuchen, a space scientist at the Uni-
versity of Michigan in Ann Arbor, said at the 
meeting. “Much of it has yet to be imagined.” ■


CLARIFICATION 

In the News Feature ‘Mystery in the heavens’ 
(Nature 534, 610-612; 2016), the discussion 
of the initial radio burst meant to say that 
over the course of just a few milliseconds, the 
source’s output matched that of 500 million 
Suns in the same time period. 




FEATURE 


BACK TO THE THESIS 


Late nights, typos, self-doubt and despair. Three leading scientists 
dust off their theses, and reflect on what the PhD was like for them. 


BY KERRI SMITH & NOAH BAKER 


Francis Collins shakes his head in bewilderment as he flicks 
through the pages of his thesis. “At this point it looks very much 
like another language,” he says, looking with puzzlement at 
page 71, which contains far more equations than text. The PhD 
was on theoretical quantum chemistry, and had “absolutely no 
practical application”, Collins says. Looking at it now, “it does 
feel a little bit like this was another person”. 

Collins was in his early 20s when he was studying for his doctorate 
at Yale University in New Haven, Connecticut, modelling how small 
groups of atoms interact. “A lot of what I did was pencil on paper, try- 
ing to solve really complicated calculus equations. It was a little lonely at 
times,” he says. Then, about halfway through his studies, he decided to 
quit his PhD and transfer to medical school. He ended up finishing the 
thesis in his spare time. “I spent many nights and many weekends trying 
to get this written out,” he says, with something of a grimace. “I made 
myself a schedule and tried to stick to it, with my little electric typewriter, 
banging away.” 

The writing machines have changed, but the slog is the same. Complet- 
ing a thesis is a huge undertaking for PhD students, and many struggle 
to get that far: only around 70% of UK students who embark on doctoral 
studies actually emerge with a PhD, and the rate is just 50% in the United 
States. Many of those who do finish move on to careers outside academia; 
even those who stay sometimes wish they'd spent more time writing 
papers — the currency of career progression — instead (see page 26). 

So what value does the thesis retain, and what lessons does completing 
one impart? To find out, Nature asked three prominent scientists to dig 
out their theses, thumb through the pages and reflect on what they — and 
the world — gained from them. What did they learn that could be of 
value to students who are writing up today? Their reflections, sometimes 
surprising, are recorded in three short films that accompany this article 
online (see go.nature.com/297qrah). 

Collins’s PhD was the start of a stellar career: he famously moved into 
biological research, identified the gene that causes cystic fibrosis, led the 
Human Genome Project to completion and now, more than 40 years after 
writing his thesis, directs the US National Institutes of Health. But that 
doesn't mean that his PhD changed the world. “Did it really add signifi- 
cantly to the knowledge the Universe contains?” he says. “Well, it would 
be a rather small contribution, to be sure.” 

But like others who went ‘back to the thesis’ 
for Nature, he thinks that what matters was not 
so much the subject or results, but what he learnt 
about the process of research along the way. “I 
think the greatest beneficiary of my PhD was not 
the Universe,” Collins says. “It was probably me.” 





FRANCIS COLLINS 


SEMICLASSICAL THEORY OF VIBRATIONALLY INELASTIC 
SCATTERING, WITH APPLICATION TO H⁺ AND H₂ (1974) 


Collins keeps his PhD on a low shelf in his office, next to those from 
his past students. He pulls it out and places a paternal hand on the 
thick, leather-bound book. “I think it turned out pretty well. It’s 
quite a hefty document,” he says. 

The road to that document started back in 1970, when Collins 
arrived in the lab of Jim Cross, a theoretical chemist at Yale. Cross 
remembers Collins as “a quiet, unassuming man, not particularly 
sophisticated culturally”. But, he says, “I quickly realized that he 
was one of the brightest and most broadly based students that I 
have ever met”. 

Collins was tasked with developing theoretical models to explain 
what happens when a proton is fired at a hydrogen molecule: how 
does the energy of the two bodies dissipate, and could the hydrogen be 
coaxed into another state? Day after day, he sat at his basement desk, 
tackling calculus equations and writing corresponding computer pro- 
grams in Fortran. He used a machine in the university computer centre 
to punch the programs onto cards, then waited until after 1 a.m., when 
electricity was cheaper, to feed the cards into the mainframe computer. 
“It did make me begin to wonder, OK, is this the right path for me?” 

It wasn’t — something Collins came to realize during an all-nighter 
about halfway through his studies. He was talking to a fellow graduate 
student, Jay Gralla, who was examining how molecules of RNA fold 
up into secondary structures. The broader aim was to understand the 
rules by which genetic information in RNA and DNA is used to build 
biological systems. Collins was blown away. “I was astounded that I had 
missed this whole thing about biology — that it was digital, it was an 
information system, it did have principles,” he says. “It was a revelation.” 

Shortly afterwards, Collins decided to switch to medical school. 
“That was a wrenching time,” he says. He was drawn to explore biol- 
ogy and medicine, but he also had a growing family, financial strains 
and “all kinds of self-doubts”. He also didn’t know if he'd actually 
done enough work to complete his PhD — but Cross told him to 
write it up anyway. Collins stayed behind in New Haven to write, 
while his wife and young daughter left for the family’s new home in 
North Carolina. He still couldn't get it done before his medical stud- 
ies started. By the time of his graduation ceremony, in May 1974, he 
was finishing his first year of medical school and expecting a second 
child. He didn't attend. 

Several years later, his medical training complete, Collins returned 
to Yale to work in a molecular-biology lab, and never looked back. 
The exactitude instilled in him by his PhD stayed with him. He had 
learned to assess a complex system, strip it down to its component 
parts and glean insights from it. “That's something that I do now in 
my lab,” he says. 

His thesis work isn’t in much demand. The model of colliding 
particles that Collins developed was a good match with others’ 
experimental findings, and made some useful approximations, but 
the work has been convincingly superseded by advances in processing 
power. “These days, a theoretical chemist wouldn't dream of limiting 
themselves by doing these approximations,” he says. 

Reflecting on it now, Collins is glad that he took the risk of switching 
fields, and would encourage today’s PhD students to take chances too. 
Transitions in a career are “when you grow the fastest; they're when 
you're really alive”. And think big, he urges. “If you're going to study 
something, study something important. It might be risky, it might be 
hard, it might not work, but there are too many people spending their 
time on obvious next steps.” 






SARA SEAGER 


EXTRASOLAR GIANT PLANETS UNDER STRONG STELLAR 
IRRADIATION (1999) 


A flicker of embarrassment crosses Sara Seager’s face when she is 
asked whether there are any mistakes in her thesis. “I definitely 
have at least one typo. I know where it is, unfortunately. I hate to 
talk about it.” She thinks for a moment, her thesis unopened on the desk 
before her. “Now that you mention it, I should probably go back and 
correct it with a pen.” 

There is little else for Seager to regret about her thesis. She is now 
a planetary scientist at the Massachusetts Institute of Technology in 
Cambridge, and, unusually, her PhD helped to found a field. “It might 
have been one of the first — if not the first — PhD theses on exoplanets,” 
she says. 

In 1996, when Seager started her postgraduate studies at Harvard 
University in Cambridge, just half a dozen planets had been spotted 
orbiting distant suns. They could be detected only indirectly, mostly by 
capturing the ‘wobble’ that an orbiting planet caused in the movement 
of a star. And the signals were noisy — some astronomers didn't believe 
that exoplanets were real. 

Seager was encouraged to enter the field by her supervisor at Harvard, 
Dimitar Sasselov, who was keen to take a different approach. Sasselov 
encouraged Seager to study the atmospheres of exoplanets to find ones 
that might harbour interesting chemistry or indicate life. This seemed 
unlikely to work when the planets themselves were so difficult to detect. 
“It was a big risk at the time: a non-tenured professor and a grad student. 
Despite the advice otherwise of colleagues in the department, we went 
ahead,” Sasselov says. 

Seager built a theoretical model suggesting that it should be possible 
to see starlight bouncing off a planet that was orbiting its star closely, 
and that analysing that light would reveal a fingerprint of the planet’s 
chemical constituents, temperature and pressure¹. Shortly afterwards, 
during her postdoc, she predicted that it should be possible to spot 
clouds in the atmosphere, and that one of the easiest elements to detect 
would be sodium². 

It was tough going. She derived equations to represent the 
components of a planet’s atmosphere and then, after teaching herself 
to code, plugged them into the computer models she was building. Her 
hours were long and isolated, and she would often hit programming 
bugs that threatened to derail her work. Meanwhile, ex-students from 
her department were calling from Silicon Valley: their companies were 
seeking people like her. “I was far from committed to a career in science. 
I often thought of leaving,” she says. 

Yet Seager “always expressed a certainty about what she was working 
on”, recalls David Charbonneau, a contemporary of hers at Harvard who 
now leads an astronomy group there, and was using Seager’s theoretical 
predictions to explain his observational results. He describes her as a 
fierce intellectual and recalls how annoying she found any imperfec-
tion. “She would get frustrated if the data weren't as unambiguous as 
she would have liked.” 

Seager says that the day she got her computer code to work “was 
one of the defining moments of my entire life”. And once her work 
was finished, she didn’t have to wait long for her predictions to be 
tested: in 2002, astronomers including Charbonneau detected the first 
exo-atmosphere³, and found that it contained the sodium signature, 
albeit at a slightly lower level than Seager had predicted. Since then, the 
field has flourished: 3,285 exoplanets have now been confirmed, and the 
study of their atmospheres has bloomed. The material in Seager’s PhD 
has been used by astronomers to request time on the Hubble Space Tel-
escope, Keck observatory and other instruments. And although Seager 
can't erase the sole typo from her thesis, she points out that the papers 
she published from it are free of mistakes. 

Did Seager enjoy her PhD? “Unfortunately, I think the answer might 
be no.” But she does have fond memories of the time she spent writing 
up her research. “I remember when I was finishing it, I didn’t go to any 
other talks, I didn't really read the news, it was just put the blinkers on 
and get the job done.” She found great satisfaction in devoting herself to 
a single task, and relished the clarity of thinking that afforded her. “The 
world goes away,” she says. “And so when you're in that zone, actually 
you're happy.” 

Seager now tries to make sure that those in her lab have the space to 
think too. “I do let the students spin their wheels. They have to, or they 
won't find their own way.” And if she could give advice to her younger 
self, it would be simply: “Hang in there.” 

As for the thesis itself — a slim, red volume with gold lettering — it’s 
not something she feels sentimental about. “I’ve met people who, they 
cry when they give away their kids’ baby clothes, but I was never one 
of those — and I think I felt the same way about the thesis.” She's more 
inclined to look forward. “In exoplanets, the best planet, the best dis-
covery, is the next discovery.” 


UTA FRITH 


PATTERN DETECTION IN NORMAL AND AUTISTIC 
CHILDREN (1968) 


“I have not looked at this in decades,” declares Uta Frith as she 
retrieves her thesis from a study in her Victorian house in subur-
ban London. The book, bound in sky-blue cloth, nestles on a low 
shelf, right next to a science-fiction encyclopaedia. She dusts it off 
with a cloth and opens it to the typewritten title page. “It looks very 
charming and childish. That’s really my immediate impression. I did 
want a short and an interesting title.” 

The title is as brief as Frith’s PhD was: she had only two years of 
funding, starting in September 1966, and at the end of 1968 she duly 
turned in the thesis: 205 pages, typed up by a secretary from her 
handwritten manuscript. The bibliography is concise, just 10 gener-
ously spaced pages. “So little was known about autism at the time 
that this was the extent of the references I found,” she says. Today, the 
developmental disorder is the subject of several thousand publica-
tions each year. 

Frith came to London from her native Germany in 1964 to attend 
a course in abnormal psychology at the Institute of Psychiatry. 
There, for the first time, she met children with autism, and was 
“completely fascinated. I still am,” she says. She also met her future 
supervisors, psychologists Beate Hermelin and Neil O'Connor. At 
that time, autism spectrum disorders were poorly understood and 
carried a stigma. Those diagnosed were usually only the severe 
cases, children with profound intellectual and linguistic difficulties. 
The mainstream view in psychiatry was that autism was a product 
of a child’s upbringing and environment and that distant, unloving 
parents — particularly mothers — were to blame. 

Frith refused to subscribe to that view. “I was always struck, when 
I met the parents of these children, how little they corresponded to 
what was told about them in the literature,” she says. The question 
that interested Frith was whether the children might process infor-
mation differently from other kids. To investigate this, she showed 
children a simple box containing green and yellow counters that were 
arranged in a specific pattern. She then covered up the box and asked 
the child to build the sequence from memory. 

She often travelled to hospitals to test children with autism, as well 
as to nurseries and schools to assess children in her control group. 
She plugged the data into mechanical calculators that were “very, very 
noisy” and then took the better part of a day to perform the statistical 
analysis. 

Frith turns to the later pages of her thesis to remind herself of what 
she found, well aware of how dated — even naive — it might sound. 
“I'm a little bit afraid of this now. What nonsense can it be?” She 
showed that children with and without autism 
both made errors in about 25% of the trials, but 
that they made different mistakes. Children in 
the control group tended to follow the pattern 
too strongly — perhaps placing three green counters together instead 
of two. Those with autism, however, placed the counters in their own 
simple pattern, such as green, yellow, green, yellow. Frith proposed 
that these children impose very strict patterns on the outside world, 
too, and this idea seemed to correlate with the behaviours that clini-
cians at the time considered characteristic of the condition — obses-
sions with particular objects, for example, or disliking change. 

Frith saw logic in the children’s responses, and felt that they were 
not necessarily inferior to those of others. “It is presumptuous to 
think that those patterns imposed by autistic children are any worse 
than the patterns I have imposed on the data,” the concluding para-
graph of her thesis reads. “Well, that’s quite philosophical,” she says, 
in modest delight. 

Frith is aware that she was studying at a golden time. Psychology 
was thriving in the United Kingdom; she had the undivided atten-
tion of two supervisors; and, just as she was coming to the end of her 
PhD, she was offered a full-time job at a Medical Research Council 
(MRC) unit where one of her supervisors had just been appointed 
director. “I was just so fortunate,” she says. The post led to a 50-year 
career with the MRC and University College London, during which 
Frith showed that children with autism have deficits in their ‘theory 
of mind’, the cognitive capacity to understand that others have their 
own beliefs and ideas. This was an important concept that was “just 
emerging in primate work” and that she adapted to studies of autism, 
says Ami Klin, who directs the Marcus Autism Centre at Emory Uni-
versity in Atlanta, Georgia, and whose 1998 PhD was co-supervised 
by Frith. “She was always extraordinarily open-minded, patient, 
supportive,” he says. 

[Photo: Uta Frith and her supervisor Neil O’Connor, in 1971.] 

Frith knows that today’s PhD students have a much tougher time: fund-
ing is tight and academic jobs scarce. But she remains a fan of the PhD 
as an apprenticeship in research. She learned from scratch how to for-
mulate hypotheses, design experiments and analyse data. “It does mean 
doing what we might call slave labour for some of the time, but you learn 
through that, and you can see what it feels like to be a scientist,” she says. 

She admits that her thesis is a product of a different era — “I'm quite 
sure it would not meet the requirements now” — and she is willing to 
bet that there are mistakes in the text. “But who knows? I haven't read 
it. Why should I? There’s so much else more interesting to read.” ■


Kerri Smith and Noah Baker are multimedia editors for Nature in 
London. 


1. Seager, S. & Sasselov, D. D. Astrophys. J. 502, L157-L161 (1998). 

2. Seager, S. & Sasselov, D. D. Astrophys. J. 537, 916-921 (2000). 

3. Charbonneau, D., Brown, T. M., Noyes, R. W. & Gilliland, R. L. Astrophys. J. 568, 
377-384 (2002). 




BY JULIE GOULD 


On the morning of Tom Marshall’s PhD defence, he put on the suit he 
had bought for the occasion and climbed onto the stage in front of a 
50-strong audience, including his parents and 6 examiners. He gave a 
15-minute-long presentation, then faced an hour of cross-examination 
about his past 5 years of neuroscience research at the Donders Institute 
for Brain, Cognition and Behaviour in Nijmegen, the Netherlands. A lot 
was at stake: this oral examination would determine whether he passed 
or failed. “At the one-hour mark someone came in, banged a stick on the 
floor and said ‘hora est’,” says Marshall — the ceremonial call that his 
time was up. “But I couldn't. I had enjoyed the whole experience far too 
much, and ended up talking for a few extra minutes.” 

Marshall’s elaborate, public PhD assessment is very different from 
that faced by Kelsie Long, an Earth-sciences PhD 
candidate at the Australian National University 
(ANU) in Canberra. Her PhD will be assessed solely on her written the- 
sis, which will be mailed off to examiners and returned with comments. 
She will do a public presentation of her work later this year, but it won't 
affect her final result. “It almost feels like a rite of passage,” she says. 
PhDs are assessed in very different ways around the world. Almost all 
involve a written thesis, but those come in many forms. In the United 
Kingdom, they are usually monographs, long explanations of a student’s 
work; in Scandinavia, science students typically top-and-tail a series of 
their publications. The accompanying oral examination — also called 
a viva voce or defence — can be a public lecture, a private discussion or 
not happen at all. There is wide variation across disciplines and from one 
institution to the next. “It is a complicated world in doctoral education. 
One format does not fit all,” says Maresi Nerad, founding director of 
the Center for Innovation and Research in Graduate Education at the 
University of Washington in Seattle. 

This isn’t necessarily a problem in itself, but some researchers worry 
that the decades-old doctoral assessment system is showing strain. Time- 
pressured examiners sometimes lack training and preparation for PhD 
assessments, which can lead to lack of rigour. “Two or three examiners 
come together to go through the thesis in a perfunctory way. They tick the 
boxes, everyone is happy, and then a PhD walks away,” says Jeremy Farrar, 
director of the biomedical research charity the Wellcome Trust in London. 

Farrar, like other scientists, suspects that 
the PhD assessment is not keeping up with 
the times. Single-author tomes seem out- 
dated when much of research has become 
a multidisciplinary, team endeavour. 
Research is becoming more open, but PhD 
assessments can lack transparency: vivas are 
sometimes held behind closed doors. Some 
PhD theses languish, little-used, on office 
shelves or in archives. “We're seeing some 
students who are still submitting paper 
theses to us — they don't have electronic 
theses yet,” says Austin McLean, director 
of scholarly communication and disserta- 
tion publishing at ProQuest in Ann Arbor, 
Michigan, which has the largest database of 
PhD theses in the world. What's more, little 
attention is given in the PhD assessment to 
soft skills such as management, entrepre- 
neurship and teamwork, even though these 
are an essential part of life beyond the PhD, 
and students are increasingly leading that life outside academia. “The 
assessment of the PhD hasn't been updated to fit the modern definition 
of a PhD,” Farrar says. 

“There are a lot of pressures to make changes to the thesis,” says Suzanne 
Ortega, president of the Council of Graduate Schools in Washington DC, 
one of a number of groups discussing the issue. The council organized 
a workshop in January this year called Future of the PhD Dissertation, 
and in March, the Australian Council of Learned Academies (ACOLA) 
in Melbourne examined changes to the thesis as part of a review on 
researcher training. Some scientists and education experts welcome 
the attention. “I don't think the current model for thesis examination is 
ideal, but there are positive movements towards changing it,” says Inger 
Mewburn, director of research training at ANU and editor of the blog 
The Thesis Whisperer, which is dedicated to those completing a thesis. 


PASSING THE TEST 

Academics agree about one thing regarding the PhD assessment — its 
aim. The traditional goal is to demonstrate the candidate’s ability to con- 
duct independent research on a novel concept and to communicate the 
results in an accessible way. Where the academics differ is on how best 
to achieve that goal. 

Shirley Tilghman, a molecular biologist and former president of 
Princeton University in New Jersey, sees merit in the monograph form 
of the thesis. It demonstrates scholarly ability by requiring students to 
“frame the historical context of a problem, describe in detail the purpose 
and execution and then come to a credible conclusion”, she says. 

But should the thesis include academic publications, too? That’s the 
norm at the Karolinska Institute in Stockholm, Sweden, where most theses 
are a compilation of the student’s original papers, along with a relatively 
short discussion, perhaps 50 pages long. The rationale is that publishing 
should be part of training because it better equips students for academic 
life and securing jobs. 

Some students who complete a monograph end up wishing that 
they had spent more time on writing papers. James Lewis successfully 
defended his physics PhD at Imperial College London in October 2015, 
but he thinks that his one published paper landed him his postdoc at 
NASA's Goddard Space Flight Center in Greenbelt, Maryland. “The job 
market for postdoc positions is very competitive,” he says, “so if you can 
get a paper published during your PhD then you're helping yourself.” 
While he waits to start, Lewis is spending his days writing papers based on 
his research. “I'm wondering: would it not have been better to write these 
instead of the thesis, which took me five months to write?” 


“‘The thicker my PhD, the better’ has become a myth in the PhD community.” 


But others argue that the pressure to publish could rob PhD students 
of valuable parts of their studies, such as the time to shape their research 
path and to think creatively and independently (see page 22). “The PhD 
might become driven by papers only,” says 
Farrar. “Students might end up spending 
their time focusing only on what papers they 
can produce, then staple them together with 
a summary and they're done — adding to the 
sense that the whole scientific enterprise is 
a paper factory rather than an exploration.” 

Long is working at ANU towards a thesis- 
by-publication: she’s written and submitted 
one paper and has started on a second. But 
she’s struggling. “I am finding this one much 
harder to write, mostly because it isn't as new 
or exciting as the previous one,” she says. 
What's more, her strategy depends on things 
at least partly outside her control — on her 
PhD generating enough complete studies for 
publication and on a reasonably timely peer-review 
process. 

Completed PhD theses are typically 
stored in university libraries — but that 
doesn’t mean that they are read or used. 
Some 60% of submissions to the ProQuest database fall under the cat- 
egory of science, technology or mathematics, but they are the ones that 
are accessed least. “We think this is because the communication is more 
journal-focused,” says McLean. Scientists do tend to keep a copy of their 
theses in their office or lab for use by students and colleagues. Neil Cur- 
son, a physicist at the London Centre for Nanotechnology, says that his 
PhD, written more than 20 years ago, is still consulted by his students 
when they come into his lab. Many theses, however, end up collecting 
dust. 


VIVA LA VIVA 

Whatever form the thesis takes, it has to be assessed — in most countries, 
by a panel of experts, and often involving an oral exam. But the viva 
“doesn't have the same level of consistency as the written form of examination”, 
says Allyson Holbrook, an education researcher at Australia’s 
University of Newcastle. In Israel, the viva is optional and very few stu- 
dents choose to have one; in the Netherlands, it is formal and ceremonial; 
in the United Kingdom, it’s typically a private affair with two or three 
examiners; and in Australia, it’s hardly performed at all. “One hundred 
per cent of the doctoral examination is about the thesis here,” says Holbrook. 
That's largely because, historically, there weren't enough experts 
in the country to examine the work in person and it was costly to fly 
them in, she says. 

Holbrook and her research team published a study last year that 
compared the assessment methods used in Australia with those in New 
Zealand and the United Kingdom (T. Lovat et al. Higher Ed. Rev. 47, 
5-23; 2015). They concluded that doing an oral defence rarely changed 
the result, and that the thesis itself was the “determinative step” of pass- 
ing. The review on Australian research training published in March 
didn't support adding a viva either, but it did recommend a move towards 
more continuous assessment of a student, rather than waiting until the 
end of the training. 

Some researchers see problems with the viva. It’s not uncommon for 
nerves to get the better of a student, and for them to freeze in front of their 
audience, however small it is. Examiners could worsen the situation by 
asking very difficult questions, says David Bogle, a chemical engineer 
at University College London. “There are cases where undue pressure 
is placed on the candidate by the examiners. This shouldn't be allowed.” 


TRIAL BY ERROR 

Most researchers don't support a global standard for the PhD assessment. 
A one-size-fits-all approach would be impossible to implement, they say, 
and the type of assessment — be it continuous appraisal, written thesis or 
oral exam — should depend on discipline, project, student, supervisor 
and institution. “If you take away the variability in assessment and form 
of the thesis then you lose all creativity and innovation from the PhD,” 
Nerad says. 

But many feel that the system could be improved — by making the 
thesis shorter, for one. Data from ProQuest, which stores 4 million the- 
ses, show that the average length 
of biology, chemistry and phys- 
ics PhDs soared to nearly 200 
pages between 1945 and 1990. 
That could be because students 
are analysing more complex 
questions, performing longer 
literature reviews and using 
increasingly complicated meth- 
ods that require lengthier expla- 
nations (see ‘The expanding 
thesis’). “It’s unnecessary to have 
such a long thesis,” says Farrar, 
who recently assessed one such 
tome. “‘The thicker my PhD, the 
better’ has become a myth in the 
PhD community, and is taking it 
down the wrong direction.” 

Farrar says that a slimmed- 
down document would be more 
appropriate. That could follow 
the concise format of a research 
paper, and include a review of 
the field, then short chapters on 
methods, analysis and discussion. 
“It would be more succinct 
and focused. And the examiners 
will probably read it all.” 

That isn’t necessarily the case 
now. Examiners have to find 
time to review theses in between 
research, teaching, grant-writing and many other demands. “Some- 
thing has to give, and what gives is the amount of time spent on any of 
those individual tasks,” says Farrar. That means an examiner might skim 
through years of a PhD student's work in just a couple of hours. “I think 
we owe it to the students to examine them properly and help prepare them 
for their future careers,” he says. 


[Figure: ‘The expanding thesis’. Chart of the average length of PhD theses in biology, chemistry and physics, with the overall average; the vertical axis runs to 200.] 


THE MODERN THESIS 
One way to better reflect the team-based nature of science would be to 
write a joint thesis, an approach that has been used in arts and humani- 
ties graduate education in the past. However, this can make it difficult 
to assign credit. “If you have worked on a collaborative dissertation, a 
potential employer might struggle to see whether you really are an independent 
thinker or could really lead a research project,” says Ortega. 
There is another matter to wrestle with — the fact that half of science 
PhD graduates in the United States are choosing careers outside of 
academia, according to the National Science Foundation’s 2014 Survey 
of Earned Doctorates. “Under those conditions, the standard assess- 
ment should include the skills in what they'll need when going on to 
future careers,” says Michael Teitelbaum, labour economist at Harvard 
Law School. 




THE EXPANDING THESIS 


The average length of science PhD theses stored by ProQuest 
has risen in recent decades, perhaps because the complex 
analyses and methods require more space to explain. 




Increasingly, institutions offer courses to PhD students in skills such 
as teamwork, management and research ethics, but these skills aren't 
usually assessed formally. The viva would be one opportunity to do so, 
perhaps by seeing how students react to various scenarios. Alternatively, 
as the ACOLA review suggested, PhD candidates could accrue credits in 
transferable skills through professional-development activities that are 
recorded in a portfolio. “You can’t just assume that if you throw them into 
an environment they will meaningfully learn from that environment,” 
says psychologist Michael Mumford, a director of the Center for Applied 
Social Research at the University of Oklahoma in Norman. “We need 
exams that ask students to deal with both real-world problems as well as 
ambiguous academic problems.” 

Farrar thinks that a change in emphasis could help. Rather than 
thinking of the thesis and viva as an exam, it should be viewed as the 
culmination of a long project. 
“You need to look at the PhD in 
the context of those four years of 
research, not just as revision for 
one big test.” 

Mewburn stresses that what- 
ever form the assessment takes, 
it should focus more on the 
individual than on their work. 
“My preference is to assess the 
researcher,” she says, “but we 
haven't developed the tools and 
curriculum to do that.” 




FEW FAILURES 

It’s difficult to find figures on how 
many students fail their PhD if 
they get to the point of submit- 
ting a thesis but, anecdotally, 
scientists say that few flunk it 
outright. More often, students are 
sent away with minor or major 
corrections that have to be com- 
pleted before the PhD is awarded. 

There are theories that few 
students fail because universi- 
ties want to keep their number of 
graduates high for the rankings. 
But most researchers dispute this, 
and point to other reasons. One 
is that weak students are likely 
to have dropped out before they reach the final assessment. Further- 
more, supervisors and the supporting institutions typically work hard — 
through regular reviews and assessments — to make sure that a candidate 
and project are of a sufficient standard before the thesis is submitted. “You 
haven't done your due diligence as a university if a student is getting to a 
stage where they are sending out theses that are going to fail,” says Simon 
Hay, a global-health researcher at the University of Washington. 

Nerad sees no need to reform the final PhD assessment. For her, the 
problem lies with the variability of graduate education as a whole. “Now 
that research is becoming more globalized, the PhD needs to be too.” That 
process is under way, Nerad says: the pressures of economic globalization, 
international policies and national drives to house world-class universities 
have led to a more standardized PhD experience across the world. 

During her tenure as Princeton's president, Tilghman was often asked if 
there was a perfect way to assess a PhD course. Not many liked her answer 
— that she could only really evaluate a student at the 25-year reunion. 
“In the end, the only way you can assess it is whether the graduates of the 
programme become successful scientists. If they do, you've done a good 
job. If they haven't, you haven't.” 


Some scientists 
would like to see 
shorter theses that 
are easier to write, 
read and examine. 




Julie Gould is an editor for Naturejobs. 




SOURCE: PROQUEST 


KEVIN FRAYER/GETTY 


COMMENT 


Lessons from 30 years of trying to improve peer review p.31 

From ‘Capability’ Brown to the US National Park Service p.34 

Fetal-tissue research is key to infant health, say subpoenaed researchers p.37 

Physicists, biologists and chemists, why so snooty? p.37 


The greater availability of data on air quality has gripped the public, especially in heavily polluted cities such as Beijing. 


Validate personal 
air-pollution sensors 


Alastair Lewis and Peter Edwards call on researchers to test the accuracy of low-cost 
monitoring devices before regulators are flooded with questionable air-quality data. 


The public is increasingly aware of 
the health and economic costs of air 
pollution. Poor air quality is linked 
to over three million deaths each year, and 
96% of people in large cities are exposed to 
pollutant levels that are above recommended 
limits¹. The costs of urban air pollution 
amount to 2% of gross domestic product in 
developed countries and 5% in developing 
countries (see go.nature.com/28qv0ka). 
Media attention and the increasing 
availability of data are reinvigorating efforts 
in many countries to tackle air pollution, 
driven as much by local and national politics 
as by science. 

In response, start-up companies are 
rushing to produce cheap air-monitoring 
sensors, costing hundreds rather than tens 
of thousands of pounds. Such devices bridge 
gaps between sparse government measure- 
ments and individuals’ wishes to track their 
personal exposures². In a wealthy city, a single 
official monitoring station might represent 
100,000 people; in emerging economies, 
one instrument covers millions of citizens. 
Although personal sensors have not yet 
achieved their market potential, applications 
are promising. Portable sensors are becoming 
a mainstay of health research by showing people’s 
exposure to environmental factors ranging 
from noise to particulate matter³,⁴. Live 
pollution data can be integrated into traffic- 
management systems to track the impacts of 
policies such as low-emissions zones. Afford- 
able air-quality devices are being produced for 
developing countries. For example, the 
United Nations Environment Programme 
launched a device in 2015 at a modest cost 
(around US$1,500) to measure particulates, 
sulfur and nitrogen oxides as part of a govern- 
ment pilot scheme in Kenya. 

All this excitement presumes that these 
low-cost air-pollution sensors are fit for 
purpose. For regulatory applications, gov- 
ernments and scientists use the most accu- 
rate, but expensive, detectors. And although 
the interpretation of the data is a subject of 
lively debate, the quality of readings is rarely 
questioned. By contrast, few of these low- 
cost devices have been rigorously tested 
and most researchers view the buzz as being 
beyond the serious business of academia. 

The research and regulatory communi- 
ties are behind the curve. The penetration of 
these devices into the public domain, generat- 
ing large volumes of untested and question- 
able data available to all, is inevitable and will 
increasingly become a headache for those 
who are responsible for managing air qual- 
ity. And opportunities beckon. Atmospheric 
chemists must engage so that these technolo- 
gies can realize their huge potential. 


COMPLEX BLEND 

Measuring atmospheric pollutants is 
challenging. Most gaseous pollutants, such 
as nitrogen dioxide (NO₂) or ozone, occur 
at parts-per-billion levels in air and are 
blended with thousands of other compounds. 
Unburnt fuel, for example, contributes many 
different hydrocarbons to the urban atmos- 
pheric mix. Added to this are large and 
changeable amounts of water vapour and 
carbon dioxide, at temperatures anywhere 
between -30°C and 50°C. This is difficult 
analytical chemistry at the best of times. 

Atmospheric chemistry research has long 
been a hotbed of invention for detection tech- 
nologies and analysis methods. Ideas emerged 
mainly from universities, institutes and a few 
research-led companies, such as Aerodyne 
Research and Picarro in the United States 
and Ionicon in Austria. The fruits of this 
labour have been tested by peer review; there 
are entire journals devoted to atmospheric 
instruments. Fresh technologies must estab- 
lish credentials. The best ones are absorbed 
by a few early-adopter research groups. 
Over perhaps a decade, successful methods 
find their way into research use; a rare few 
make it into regulatory networks. Along the 
way come dozens of papers, international 
evaluations, comparison exercises, reference 
materials and best-practice guides. 

By contrast, most of the latest air-pollution 
sensors are developed by small- and medium- 
sized enterprises, backed by venture capital 
and crowdsourced funding. Many devices 
adapt off-the-shelf technologies. Peer review 
and academic evaluation may be bypassed. 
The public are the early adopters; research 
chemists and physicists are largely on the 
sidelines. Academics’ funding is threatened 
by this commercial acceleration, because 
these devices mean that incremental research 
developments — such as the miniaturization 
of high-quality detectors, often based on 
optical absorption, particle counting or mass 
spectrometry — are less attractive to granting 
agencies. Many of the processes used for 
cheap sensors, such as chemical interactions 
between gases and surfaces, are less well 
understood. 

A sensor used to measure air quality in Kenya. 

The range of devices is wide. The cheapest, 
costing a few dollars each, use technologies 
that have been repurposed from hazard detec- 
tors, such as metal-oxide sensors that meas- 
ure oxidizable gases. For tens to hundreds of 
dollars, electrochemical or photoionization 
detection can notionally observe particular 
compounds or classes. In the $150-1,500 
band come miniaturized instruments, such 
as optical particle counters that can fit in your 
palm. In general, reducing cost inevitably 
reduces specificity or sensitivity, or both. 


KEEP TESTING 
Most commercial sensors target parameters 
that governments need to track, such as levels 
of particulate matter (PM) and NO₂. To do a 
thorough job requires calibration of the target 
compound and all other possible interferences 
that might be present. City authorities 
and the public lack the technical means of 
checking these themselves, so must take the 
quality of the measurements on trust from the 
supplier. The US Environmental Protection 
Agency has created a technical framework 
for testing sensors in public use, benchmark- 
ing them against the most accurate monitors. 
But manufacturers might not engage with this 
process unless they are required to. 

The literature on real-world sensor 
performance is thin. Anecdotally, we have 
heard that leading research labs have tested 
commercial sensors and found them wanting. 
But because papers reporting negative 
results have low priority, only a few 
studies have been published (see, for example, 
refs 5 and 6). These reveal stability and sensitivity 
issues, and show that the sensors react 
to other air pollutants and longer-lived gases 
such as CO₂ and hydrogen. They are also 
influenced by meteorological conditions such 
as humidity, temperature and wind speed. 

Simple sensors perform best when 
pollution levels are high and when the compound 
of interest swamps others — for example, 
sensors for nitric oxide (NO) and NO₂ 
seem to work well in locations that have heavy 
traffic and high pollution levels, where concentrations 
of these gases approach the parts-per-million 
level. In more typical conditions, 
sensors might respond to other atmospheric 
species as well. Calibrations of cheap sensors 
performed in the lab and in the field can differ 
markedly’ , and most relationships observed 
in the field only apply to that location and for 
a limited time. 
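
The gap between lab and field calibrations can be illustrated with a minimal sketch. It assumes a simple linear sensor response, and every number in it (the reference concentrations, the millivolt readings, the names `fit_line` and `calibrate`) is hypothetical and chosen only for illustration, not taken from any real device or study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(raw, slope, intercept):
    """Invert the fitted line to turn a raw reading into a concentration."""
    return (raw - intercept) / slope


# Hypothetical data: reference-monitor concentrations (ppb) and raw sensor output (mV).
reference_ppb = [0.0, 20.0, 40.0, 60.0, 80.0]
lab_mv = [10.0, 50.0, 90.0, 130.0, 170.0]     # clean air plus the target gas only
field_mv = [40.0, 70.0, 100.0, 130.0, 160.0]  # ambient air with other species present

lab_slope, lab_intercept = fit_line(reference_ppb, lab_mv)
field_slope, field_intercept = fit_line(reference_ppb, field_mv)

# The same raw reading maps to different concentrations under the two calibrations.
reading_mv = 100.0
lab_estimate = calibrate(reading_mv, lab_slope, lab_intercept)
field_estimate = calibrate(reading_mv, field_slope, field_intercept)
print(f"lab calibration: {lab_estimate:.1f} ppb; field calibration: {field_estimate:.1f} ppb")
```

In this toy case the two fits disagree on both slope and offset, so a single reading yields two different concentrations. A field fit of this kind would also, as noted above, only hold for the place and period in which it was made.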

Our research shows that the biggest headaches 
are caused by interfering chemicals, 
such as CO₂ and H₂, and by the irreproducibility 
of measurements. Our real-world test 
of 20 identical ozone sensors on a roof found a 
difference of a factor of 6 between the highest 
and lowest measurements⁷. In other words, 
the variability of the responses was greater 
than that of the actual atmosphere. We tested 
a mid-priced electrochemical sensor for NO₂ 
in real conditions for an atmospheric concentration 
of 40 micrograms per cubic metre (the 
European air-quality limit value). We found 
that roughly half of the signal from the sensor 
was from NO₂, and that the rest came from 
the sensor's response to ambient CO₂. The 
device was detecting changes in air pollution 
minute by minute, but not only changes in 
NO₂. 
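
Both effects described here, an interfering gas contributing to the signal and a wide spread between nominally identical sensors, can be sketched in a few lines. The sketch assumes that sensor responses to different gases simply add, which is a simplification. The sensitivities, the CO₂ level and the 20 ozone readings are hypothetical values picked to reproduce the proportions reported above, not measured data.

```python
def signal_shares(sensitivities, concentrations):
    """Share of the total signal contributed by each gas, assuming additive responses."""
    contributions = {
        gas: sensitivities[gas] * concentrations[gas] for gas in sensitivities
    }
    total = sum(contributions.values())
    return {gas: value / total for gas, value in contributions.items()}


def spread_factor(readings):
    """Ratio of the highest to the lowest reading from co-located sensors."""
    return max(readings) / min(readings)


# Hypothetical sensitivities (signal units per unit of concentration).
sensitivities = {"NO2": 0.5, "CO2": 0.05}
# NO2 at the 40 micrograms per cubic metre limit value; the CO2 figure (ppm) is illustrative.
concentrations = {"NO2": 40.0, "CO2": 400.0}

shares = signal_shares(sensitivities, concentrations)
print({gas: round(share, 2) for gas, share in shares.items()})

# Hypothetical simultaneous readings (ppb) from 20 nominally identical ozone sensors.
ozone_ppb = [10, 12, 15, 18, 20, 22, 25, 27, 30, 32,
             35, 38, 40, 42, 45, 48, 50, 54, 57, 60]
print(f"highest / lowest = {spread_factor(ozone_ppb):.1f}")
```

A weak sensitivity to an abundant gas can match a strong sensitivity to a scarce one, which is why the share of signal from the target compound can be as low as half.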


FAIR USE 

Does it matter that a sensor reports an indic- 
ative value or trend? It depends on how they 
are sold and used. Some cheap devices are 
advertised as being simply for raising aware- 
ness of pollution, and it might be expecting 
too much for them to report accurate val- 
ues. Others claim to give pollutant measures 
that can be compared against conventional 
monitors or official model forecasts. 

Until there is agreement on what degree of 
sensor accuracy is acceptable, we urge cau- 
tion. Their fitness for purpose should be dem- 
onstrated, particularly where they will have a 
role in decision-making — whether it is at a 
city, community or personal level. Although 
we do not wish to stifle innovation, sensors 
that claim to be able to measure ambient pol- 
lution levels could be required to undergo an 
independent testing regime, as is the case for 
instruments that are used in regulatory measurements. 
Some definition of measurement 
uncertainty is needed, as is standard practice 
in other fields — even bathroom scales come 
with uncertainties printed on them. A mark 
should signify that the sensor meets a minimum 
quality standard. 

ALEXANDER IKAWAH 

If such a stamp of approval sounds 
bureaucratic, think of how the data might 
be used. People with asthma might use their 
local sensor data to make personal decisions 
on medication; an air-pollution sensor is not 
meant as a medical device, but its real-world 
application could make it function like one. 
Privately owned sensor data could trigger 
legal actions in areas that apparently exceed 
local air-quality standards. The economic 
and socially disruptive costs of closing roads 
or banning cars based on live sensor data 
would be huge. 


NEXT STEPS 
The academic air-pollution community must 
do the hard yards in the lab and field on cali- 
bration and testing. It must also find ways to 
overcome some measurement challenges. 
Researchers should take the lead on evalu- 
ating sensor performance, creating better 
devices and designing research applications 
that are suited to the quantified capabilities 
of sensors. 

More creativity is needed in experimental 
design. If the long-term performance of sensors 
is a problem, as is likely, then we need 
to design shorter-term experiments that 
can be performed reliably. For example, a 
fine-scale but qualitative measure of pol- 
lution might help to simulate the turbulent 
flows of pollution in street canyons or tree 
canopies over a few days. There might be 
experiments in which a fast-responding bulk 
sensor — one that measures the sum of many 
organic compounds, for example — might 
be able to track rapid temporal changes that 
add context to a slower but more quantitative 
instrument, such as a gas chromatograph or diffusion 
tube. Statistical and machine-learning methods might be 
developed to enable better extraction of 
signals from a mix of pollutants⁸. 

However, academics should not become 
gatekeepers or validation bodies. This is a 
job for manufacturers and regulators, who 
need to define how and where sensors can 
and cannot be used effectively. 

Governments must provide advice now to 
potential ‘professional’ users, such as in cities 
and regional environmental agencies. For 
sensors that might be used for public policy, 
health studies or any type of infrastructure 
control, independent testing and verification 
is essential, as is already being done through 
long-standing environment-agency committees 
and national air-pollution schemes. 
Even sensors that are designed for entertain- 
ment or awareness-raising need appropriate 
labelling to define their capabilities. 

Well designed sensor experiments, that 
acknowledge the limitations of the tech- 
nologies as well as the strengths, have the 
potential to simultaneously advance basic 
science, monitor air pollution — and bring 
the public along. 


Alastair Lewis is a science director at the 
National Centre for Atmospheric Science 
in Leeds, UK, and professor of atmospheric 
chemistry at the University of York, UK. 
Peter Edwards is a research fellow in 
the Wolfson Atmospheric Chemistry 
Laboratories at the University of York, UK. 
e-mails: ally.lewis@ncas.ac.uk; 
pete.edwards@york.ac.uk 


1. Lim, S. S. et al. Lancet 380, 2224-2260 (2012). 
2. Kumar, P. et al. Environ. Int. 75, 199-205 (2015). 
3. Piedrahita, R. et al. Atmos. Meas. Tech. 7, 3325-3336 (2014). 
4. Nieuwenhuijsen, M. J. et al. Environ. Sci. Technol. 49, 2977-2982 (2015). 
5. Mead, M. I. et al. Atmos. Environ. 70, 186-203 (2013). 
6. Kamionka, M., Breuil, P. & Pijolat, C. Sens. Actuators B Chem. 118, 323-327 (2006). 
7. Lewis, A. C. et al. Faraday Discuss. http://dx.doi.org/10.1039/C5FD00201J (2015). 
8. De Vito, S., Piga, M., Martinotto, L. & Di Francia, G. Sens. Actuators B Chem. 143, 182-191 (2009). 




Make peer review scientific 


Thirty years on from the first congress on peer review, Drummond Rennie reflects on 
the improvements brought about by research into the process — and calls for more. 


Peer review is touted as a demonstration 
of the self-critical nature of science. 
But it is a human system. Everybody 
involved brings prejudices, misunder- 
standings and gaps in knowledge, so no 
one should be surprised that peer review is 
often biased and inefficient. It is occasion- 
ally corrupt, sometimes a charade, an open 
temptation to plagiarists. Even with the best 
of intentions, how and whether peer review 
identifies high-quality science is unknown. 
It is, in short, unscientific. 

A long time ago, scientists moved from 
alchemy to chemistry, from astrology to 
astronomy. But our reverence for peer 
review still often borders on mysticism. For 
the past three decades, I have advocated 
for research to improve peer review and 
thus the quality of the scientific literature. 
Here are some reflections on that winding, 
rocky path, and some thoughts about the 
road ahead. 


I trained as a physician, studying the 
pathophysiology of exposure to high 
altitudes. In 1977, I became deputy editor 
of The New England Journal of Medicine 
(NEJM), working with what I assumed was 
a smoothly oiled peer-review system. I found 
myself driving an enormous machine whose 
operation was sometimes interrupted by 
startling hiccups. The first big one occurred 
a year after I arrived. An author who had 
submitted a paper to our journal accused 
one of our reviewers, who worked at a com- 
peting lab, of plagiarizing parts of her paper. 
She sent us a manuscript that her lab chief 
had been sent to assess for another journal, 
one that I could see had been typed on the 
same typewriter that the reviewer had used 
to write his review. I was told to sort it out. 

This was more than a decade before 
a formal definition of research miscon- 
duct and systems for its investigation were 
established. Several careers fell apart. That 
of the actual plagiarist, and also that of his 
chief, our reviewer, who was the senior 
co-author of the manuscript that contained 
the plagiarism. Tragically, our innocent sub- 
mitting author also gave up research when 
her accusations were rebuffed, and she was 
bullied and demeaned for her persistence 
and integrity. 

This slow-motion catastrophe angered 
me. How common was such incompetence, 
confusion and corruption? Did peer review 
root it out — or just lob it down the road? 
A few years later, revelations of fabricated 
data in scores of papers by US cardiologist 
John Darsee, in NEJM and other journals, 
showed that peer review was usually help- 
less in detecting gross fraud. More recently, 
the cases of Dutch psychologist Diederik 
Stapel and US-based cancer researcher 
Anil Potti underline how easily false data 
continue to get through the system. Even 
if peer review could not detect outright 
fabrications, could it sniff out error in honest 
scientific work, I wondered? There had 
to be a way to find out. 


SELECTING GOOD SCIENCE 

Milestones in modern peer review and reporting. 

1978-79 Revelations of scientific fraud at Yale 
and Harvard universities publicize the issue. 

1978-92 The Oxford Database of Perinatal Trials 
is set up by Iain Chalmers. He later establishes the 
Cochrane Collaboration and its systematic analyses. 

1986 Studies demonstrate publication bias 
in clinical trials; it is caused by the failure of trial 
authors to submit results for publication. 

1989 Regulations defining scientific misconduct 
and a procedure to address allegations are codified 
into US law. Peer review is revealed to be ineffective 
against misconduct. 

1989 The first Peer Review Congress is held in 
Chicago, Illinois. It includes a trial of blinding 
reviewers to authors’ identities. 

1993 The Cochrane Collaboration, founded to 
review published reports relevant to health, reveals 
inherent biases. 

1996 The CONSORT statement on reporting 
clinical trials is released, with a checklist to assist 
authors and reviewers. 

1999 The British Medical Journal adopts 
open peer review on the basis of evidence from 
randomized trials of the practice. 

2000-PRESENT Online-only journals rise in 
prominence along with new models of peer review. 

2004 Clinical-trial pre-registration is made a 
condition of publication. 

2006 The EQUATOR Network is founded to 
assemble reporting guidelines. 

2010 ‘Beall’s list’ warns against ‘predatory’ 
journals with questionable peer review. 

2014-PRESENT Groups (including ORCID, 
CASRAI, F1000 working group) are founded to 
support and credit reviewers. 

2017 Eighth Peer Review Congress to be held in 
Chicago. 


QUESTIONS ASKED 

In 1985, an influential commentary¹ asserted 
that “the arbiters of rigor, quality, and inno- 
vation in scientific reports” did not “apply 
to their own work the standards they use 
in judging the work of others”. Ouch! Peer 
review had to be studied, it said, and the 
most urgent need was leadership within the 
scientific community. 

I had been working at The Journal of the 
American Medical Association (JAMA) since 
1983. The chief editor was interested in hold- 
ing a conference on peer review; I jumped 
at the chance. I insisted that all presenta- 
tions describe research — and then worried 
whether we would get a single abstract. 

The inaugural Peer Review Congress was 
held in a distinctly shabby hotel in Chicago, 
Illinois, in 1989. It was engaging and con- 
tentious: presenters studied the demography 
of reviewers at various journals, how often 
individuals conducted reviews, blinding, 
statistical reporting and much more. I was 
thrilled to see actual data. 

A distinguished editor in the audience 
took another view, excoriating presentation 
after presentation. Finally, Iain Chalmers 
(who later co-founded the Cochrane Col- 
laboration) stood and addressed him: “We 
have listened to your incessant criticisms 
of everyone who has gone to the trouble 
of obtaining data. What we have not heard 
from you is one single piece of evidence for 
your opinions.” There was loud applause, 
and the future of these congresses was 
assured. They have taken place every four 
years since — in much better hotels. 

Thanks to such research, we now know 
a great deal about the mechanics of peer 
review — the time taken to appraise papers, 
rates of disagreement between reviewers, the 
cost at certain journals, even the occurrence 
of misconduct during review. 

Research has brought clear improvement 
to the biased reporting of clinical trials. 
Randomized clinical trials cost millions 
of dollars, are rarely repeated, and greatly 
influence what treatments patients receive. 
My colleagues and I showed that most trial 
results in submitted manuscripts favoured 
the treatment tested, and this was reflected 
in the results that were published². Other 
work revealed that more than 90% of the 
bias was due to authors failing to submit 
manuscripts that are unfavourable to the 
treatment, and that commercial sponsorship 
drove decisions not to submit³. Although any 
single trial might have been conducted well, 
the system was skewed. Publication bias 
made drugs look better than they were. 
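
The arithmetic behind this skew can be made concrete with a small sketch. It assumes, purely for illustration, that editors accept favourable and unfavourable manuscripts at the same rate, so that any imbalance in print comes from what authors choose to submit. The trial counts and rates below, and the function name `favourable_shares`, are hypothetical and are not figures from the studies cited.

```python
def favourable_shares(n_fav, n_unfav, submit_fav, submit_unfav, accept_rate):
    """Expected share of favourable results among completed, submitted and
    published trials, when acceptance does not depend on the outcome."""
    submitted_fav = n_fav * submit_fav
    submitted_unfav = n_unfav * submit_unfav
    published_fav = submitted_fav * accept_rate
    published_unfav = submitted_unfav * accept_rate
    return {
        "completed": n_fav / (n_fav + n_unfav),
        "submitted": submitted_fav / (submitted_fav + submitted_unfav),
        "published": published_fav / (published_fav + published_unfav),
    }


# Hypothetical scenario: 50 favourable and 50 unfavourable trials are completed,
# but authors submit 90% of the former and only 30% of the latter.
shares = favourable_shares(
    n_fav=50, n_unfav=50, submit_fav=0.9, submit_unfav=0.3, accept_rate=0.2
)
for stage, share in shares.items():
    print(f"{stage}: {share:.0%} favourable")
```

Under these assumptions the published record is exactly as lopsided as the pile of submissions, even though every individual trial and every editorial decision is unbiased.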

This line of investigation provided evidence 
that convinced journals to require 
that clinical trials be ‘pre-registered’ at 
inception. Compliance is still patchy, but 
journal editors now routinely check that 
trials were announced publicly (typically at 
ClinicalTrials.gov) before results were collected. 
We can now expect that when drugs 
are found to cause serious harm during the 
trials, the existence of those trials will no 
longer be hidden from the world. 

Meta-research has revealed other sources 
of distortion. For instance, when trial reports 
fail to account for control patients or do not 
fully describe methods for randomization 
and blinding, they are also more likely to 
report exaggerated effects. 

Such observations led to new standards for 
reporting clinical trials. An early version of 
the guidelines was tested in JAMA and produced
a report that our readers found unreadable⁴.
The next version of the guidelines,
published in 1996 and called CONSORT 
(Consolidated Standards of Reporting Trials, 
of which I am a co-organizer), was much better
accepted. These proved a highly successful
model for reporting, say, epidemiologic studies,
or reports assessing clinical tests⁵. A collection
of more than 300 reporting guidelines
has been gathered into the EQUATOR Network
(www.equator-network.org), and their
use is spreading widely among biomedical 
researchers, journals and reviewers. 

Meta-research on clinical trials has been 
further advanced by the Cochrane Collabo- 
ration, which systematically collects studies 
across disease types to weigh up the evi- 
dence. Cochrane has developed ‘risk of bias’ 
assessments to help its reviewers to evaluate 
possible weaknesses in trial reports. 


OPEN REVIEW 

Blinding of reviews is another fertile area of 
study. In 1998, my colleagues and I conducted 
a five-journal trial⁶ of double-blind peer
review (neither author nor reviewer knows 
the identity of the other). We found no dif- 
ference in the quality of reviews. What's more, 
attempts to mask authors’ identities were 
often ineffective and imposed a considerable 
bureaucratic burden. We concluded that the 
only potential benefit to a (largely unsuccess- 
ful) policy of masking is the appearance, not 
the reality, of fairness. Since then, online tech- 
nologies for blinding have increased, as have 
numbers of scientists (and thus the difficulty 
of guessing who authors may be). It will be 
interesting to see how similar studies work 
out now, and whether double-blind review- 
ing affects acceptance rates for women and 
under-represented minorities. 

More than a decade ago, the British Medi- 
cal Journal (BMJ) ran trials in which the 
identities of both author and reviewer were 
disclosed to each other during review, and, 
if the paper was published, the reviewers’ 
names were made public. The BMJ did not 
suffer a loss of manuscripts or reviewers, and 


ILLUSTRATION BY DAVID PARKINS 


now makes such disclosures compulsory. Its 
experience suggests that how questions are 
posed is crucial. If a survey asks: ‘Would you
like to sign your review?’, most will decline.
But if an editor says: ‘Our journal requires
signed reviews. Will you review?’, the BMJ's
experience is that very few will refuse⁷. I
believe that this brand of open review is the 
most ethical variety, and its practicability is 
established. In the present system, authors 
frequently misidentify reviewers with com- 
plete confidence, so blame falls on innocent 
bystanders. 


THE FUTURE 

The past 15 years have seen an exciting surge 
of experimentation with new models of peer 
review: open, blinded, pre- and post-publication,
portable and so on⁸. Some of these systems
were tried and abandoned decades ago,
before the Internet eased testing and logistics. 

We need rigorous studies to tell us the 
pros and cons of these approaches today. 
Until then any advertised advantages of new 
arrangements are unsupported assertions. 
A 2015 survey⁹ of more than 1,000 manuscripts
was encouraging about the ability of
review to identify important papers, but still 
found lapses. 

After all, online technologies don’t give 
reviewers more time or stamina. A common 
claim of new journals, whether legitimate or 
‘predatory’ (those that charge fees to publish, 
but that do not offer standard publishing ser- 
vices), is rapid review and publication. This is 
a powerful pull for authors, but the detailed 
attention and mature reflection required for 
a constructive review take time.

So what now? In my field, and perhaps 
in many others: follow the triallists. First, 


develop evidence-based lists of items to be 
included in reporting (mission-sort-of- 
accomplished for many clinical journals). 
Journals must accept and promote these 
guidelines and ensure that reviewers hold 
authors to them; perhaps they should facili- 
tate training in peer review, which has been 
shown to improve performance. Finally, man- 
uscript editors and copy editors must uphold 
the standards. For example, we now routinely 
reject trial reports that cannot prove registra- 
tion before inception. This change is large for 
all involved — authors, reviewers and journal 
staff — and it is taking years. 

And we must continue to study what we 
have done. Assessment of review is more 
likely now than ever before. The two-year- 
old Meta-Research 
Innovation Center 
(METRICS) Institute 
at Stanford University 
in California, which is 
devoted to research- 
ing and improving 
the process of science, 
shows that the field is maturing and gain- 
ing respect. So does last year’s launch of the 
journal Research Integrity and Peer Review, 
a home for research on the topic.

In 1986, we were lucky with our timing. 
The peer-review congresses came just as oth- 
ers were trying to see what could be learned 
from the literature to arrive at the best treat- 
ments for patients, developing methods for 
systematic review, and nailing down the 
biases that pervade clinical research (see 
‘Selecting good science’). These people did 
the work. 

To announce that first Peer Review Con- 
gress, I wrote: “There are scarcely any bars
to eventual publication. There seems to be 
no study too fragmented, no hypothesis too 
trivial, no literature citation too biased or too 
egotistical, no design too warped, no meth- 
odology too bungled, no presentation of 
results too inaccurate, too obscure, and too 
contradictory, no analysis too self-serving, 
no argument too circular, no conclusions too 
trifling or too unjustified, and no grammar 
and syntax too offensive for a paper to end 
up in print”¹⁰.

Unfortunately, that statement is still true 
today, and I'm not just talking about preda- 
tory journals. That said, I am confident that 
the Peer Review Congress scheduled for 2017 
will be asking more incisive, actionable questions
than ever before.


Drummond Rennie is a co-organizer
of CONSORT, a former member of the
Commission on Research Integrity for the US 
Public Health Service, and former president 
of the World Association of Medical Editors. 
e-mail: drummond.rennie@ucsf.edu 


1. Bailar, J. C. & Patterson, K. N. Engl. J. Med. 312, 
654-657 (1985). 
2. Olson, C. M. et al. J. Am. Med. Assoc. 287,
2825-2828 (2002). 
3. Dickersin, K. & Rennie, D. J. Am. Med. Assoc. 290, 
516-523 (2003). 
4. Rennie, D. J. Am. Med. Assoc. 273, 1054-1055 
(1995). 
5. Begg, C. et al. J. Am. Med. Assoc. 276, 637-639 
(1996). 
6. Justice, A.C. et al. J. Am. Med. Assoc. 280, 
240-242 (1998). 
7. Groves, T. Br. Med. J. 341, c6424 (2010).
8. Paglione, L. D. & Lawrence, R. N. Learn. Publ. 28,
309-316 (2015). 
9. Siler, K., Lee, K. & Bero, L. Proc. Natl Acad. Sci. 
USA 112, 360-365 (2015). 
10. Rennie, D. J. Am. Med. Assoc. 256, 2391-2392
(1986). 




7 JULY 2016 | VOL 535 | NATURE | 33 


Geniuses of place


Ethan Carr traces the arc of influence in landscape creation
and preservation from ‘Capability’ Brown to Frederick Law
Olmsted and the US National Park Service.


A coincidence of commemorative dates
makes this year an important one in
the history of landscape design and
scenic preservation. As the 300th anniver- 
sary of the birth of the landscape gardener 
Lancelot ‘Capability’ Brown is celebrated on 
one side of the Atlantic, the United States is 
marking the centenary of the National Park 
Service, the federal agency that acts as the 
steward of the nation’s most iconic natural 
areas and historic shrines. The two are con- 
nected by the complex and evolving cultural 
construction of ‘nature’, its representations,
its manifestations and its benefits. 

Brown's landscape parks expressed the
eighteenth century’s fascination with
nature itself, which was increasingly the 
subject of scientific inquiry and a plethora 
of botanical and zoological discoveries. 
Nature offered templates for ordering soci- 
ety, too. When the poet Alexander Pope 
exhorted, “In all, let Nature never be forgot,”
he was describing more than the new style 
of landscape gardening. Brown’s composed
scenes of pastoral greenswards and planted
woodlands expressed picturesque aesthetic 
theory; they also imposed a more scientific 
and modern order on the land. 

In the United States, landscape architect 
Frederick Law Olmsted developed his own 
‘natural style’ in the nineteenth century. 
Olmsted was deeply influenced by his expe- 
riences in Britain, which he described in his 
first book, Walks and Talks of an American 
Farmer in England (1852). In the spring of 
1850, he visited Birkenhead Park in north- 
west England, noting that in “democratic 
America there was nothing to be thought 
of as comparable to this People’s Garden”. 
Olmsted also responded to the country- 
side itself, and, above all, to the landscape 
parks he visited. About the designer of the 
grounds at Eaton Hall in Cheshire, he wrote: 
“What artist, so noble... as he who, with far- 
reaching conception of beauty and design- 
ing power, sketches the outline, writes the 
colours, and directs the shadows of a picture 
so great that Nature shall be employed upon
it for generations, before the work he has
arranged for her shall realize his intentions.” 
That artist was Brown, who had died in 
1783. His park landscapes, now mature, 
thoroughly impressed the young “Ameri- 
can farmer”. Sweeping meadows, clumps 
and belts of native (and North American) 
trees, sheets of impounded water and wind- 
ing drives were the elements that shaped the 
aesthetics and image of the “natural” in an 
urbanizing and industrializing world. 
Olmsted soon assumed the mantle of 
artist himself. At first, he worked with a 
partner: the English architect Calvert Vaux, 
with whom he won the competition for the 
design of New York City’s Central Park in 
1858. Olmsted returned to England several 
times. He expressed ambivalence towards 
Victorian design 
trends, which empha- 
sized ‘gardenesque’ 
displays of floricul- 
ture and frankly arti- 
ficial arrangements of 




JEFF MURRAY/AURORA 


specimen plants. Olmsted’s taste could be 
considered atavistic. Like Brown, he sought 
to create compositions of larger ‘landscape
effects’, devoid of elaborate flower gardens
or other distractions from the fundamental 
experience of scenery. In practice, Olmsted 
created dramatic sequences of landscapes 
— expansive greenswards, serpentine 
lakes, picturesque rambles — and eschewed 
buildings, geometric layouts and flower 
beds. He would not have another kindred 
spirit in British landscape gardening until 
1870, when William Robinson — who later 


designed the grounds of Gravetye Manor in 
Sussex — published the book Wild Garden. 
Robinson visited Olmsted in New York that 
year, and the two maintained a correspond- 
ence and a mutual admiration. 


THE LIE OF THE LAND 

Olmsted was also influenced by continued 
progress in contemporary natural sciences, 
especially geology, which he knew mostly 
through the work of the researchers Louis
Agassiz and Nathaniel Shaler at Harvard
University in Cambridge, Massachusetts. 
With Vaux and on his own, Olmsted 
exploited existing geological formations 
in his large municipal-park designs to 
create specific effects and to structure the 
overall landscape 
composition. 

The schist bed- 
rock outcrops of 
Manhattan and the 
puddingstone con- 
glomerate of Boston, 
Massachusetts, are 
design features (and construction mat- 
erials) in Central Park and Franklin Park, 
respectively. In Brooklyn, New York, the 
terminal moraine glacial morphology of 
Long Island became the framework for 
the entire conception of Prospect Park as a 
sequence of landscape experiences, from the 
high ground of the main entrance down to 
the glacial outwash plain, in which a large 
lake was excavated. What Pope described as

BOOKS & ARTS | COMMENT 


California’s Yosemite 
Valley was one of the 
first US national parks. 


the “genius of the place”, for Olmsted, resided
in the landscape’s skeleton — its geological 
foundations — which he often exposed and 
highlighted, and around which landscape 
effects and overall patterns of how people 
might use the park could be structured. 

As a public intellectual, Olmsted also 
developed the political rhetoric and eco- 
nomic justifications for larger regional and 
national scenic reservations. In 1865, the 
governor of California asked him to pre- 
pare a report to guide the management of 
Yosemite Valley. This granite gorge hid- 
den in the Sierra Nevada mountains is one 
of the great geological landscapes of the 
continent. It became the site, more than 
any other, where the idea of the national 
park took shape. It was for Yosemite that 
Olmsted provided the philosophical frame- 
work for state and national park-making 
in the United States. He noted that it was 
“the main duty of government” to protect 
and provide the means for the “pursuit of 
happiness”. That pursuit, for Olmsted,
depended on preserving such places and 
creating public access to them. 

“It is a scientific fact,” he asserted, “that the
occasional contemplation of natural scenes 
of an impressive character ... is favorable to 
the health and vigor of men.” The govern- 
ment had a duty to assure that “enjoyment 
of the choicest natural scenes in the country 
and the means of recreation connected with 
them” be “laid open to the use of the body of 
the people”. If the government did not act, 
those places would be monopolized by the 
few and their benefits experienced only by 
an elite. The establishment of “great public 
grounds” was therefore required of a repub- 
lic that derived its authority from its people. 

There was a continuity and consistency in 
the overall purposes that Olmsted described 
for public parks and scenic reservations, as 
well as in his design recommendations for 
both. At Yosemite Valley and New York’s 
Niagara Falls (for which he and Vaux pre- 
pared the state-park plan in 1887), for 
example, the challenge was to protect the 
awesome existing features from damage by 
visitors, and to choreograph the sequence 
and pace of their visits in the design of roads, 
paths and other facilities, without marring 
the scenery with buildings. 


SHAPING DEMOCRACY 

Brown is supposed to have said, “One does 
not go up and down steps in nature”, referencing
his preference for smoothly graded
contours over retaining walls or terraces. 
In their Central Park competition entry, 
Olmsted and Vaux similarly insisted that: 
“the interest of the visitor ... should concen- 
trate on features of natural, in preference 
to artificial, beauty... Architectural struc- 
tures should be confessedly subservient to 
the main idea.” In the changed context of 
nineteenth-century, urbanizing US soci- 
ety, the main purpose of the large, public 
park (whether municipal, state or national) 
remained constant: to provide a dramatic 
sequence of affecting landscape experiences 
and effects, unencumbered by encroach- 
ments, and now made available to “the body 
of the people”. 

This rhetoric of public park-making is 
particularly important while the cente- 
nary of the National Park Service is being 
celebrated. Congress created the agency 
in 1916, giving it a famous mandate “to 
conserve the scenery and the natural and 
historic objects and the wild life” of the 
national parks, and “to provide for the 
enjoyment of the same in such manner and 
by such means as will leave them unim- 
paired for the enjoyment of future genera- 
tions”. This key portion of the legislation 
was written by Frederick Law Olmsted Jr, 
who continued his father’s professional 
practice in the twentieth century, and 
who was directly inspired by his father’s
Yosemite report in drafting the park- 
service bill. 

Congress had created national parks in the 
mid-nineteenth century — notably Yosemite 
in 1864 and Yellowstone, Wyoming, in 1872. 
But the far-flung group of about 35 reserva- 
tions had remained relatively inaccessible to 
most people. That changed with the advent 
of affordable and reliable automobiles. The 
park service was created to better manage 
both the great potential for public enjoyment 
and the great peril to the parks presented by 
vastly increased numbers of tourists in cars. 

Today, there are more than 400 ‘units’ in 
the national-park system, including scores 
of historic sites, memorial landscapes and 
archaeological sites, in addition to the better- 
known large-scale wilderness reservations. 
The national parks are often characterized as 



Top: Capability Brown’s garden at Blenheim Palace, UK; bottom: Central Park, New York. 


‘America’s best idea’, a bromide that obscures
as much as acknowledges their significance 
and origins. The idea was rooted in the 
nineteenth-century park movement, and 
therefore in the thought reflected in the 
elder Olmsted’s writings, and embodied in 
his designs. These in turn have unambiguous 
links to that “artist so noble”, born 300 years
ago, Capability Brown.


Ethan Carr is a landscape historian
and preservationist specializing in
public landscapes at the University of
Massachusetts Amherst. He is the author 
of Wilderness by Design and Mission 66, 
and the volume editor of The Papers of 
Frederick Law Olmsted, Volume VIII: The 
Early Boston Years, 1882-1890. 

e-mail: ecarr@umass.edu 


TOP: OLAF PROTZE/LIGHTROCKET/GETTY; BOTTOM: MICHAEL YAMASHITA/NGC 


Correspondence 


Stop slaughter of 
migrating songbirds 


A new strategy is needed to stop
the illegal trapping and killing of 
millions of songbirds every year 
in the Mediterranean region, 
where gigantic vertical nets 
intercept major migration flyways 
(see also Nature 529, 452-455; 
2016). In the western Maghreb
in North Africa, this carnage is 
collateral damage to the area's 
cultural fancy for pet goldfinches 
(Carduelis carduelis), which 
dates back to around 700 and the 
Umayyad dynasty. 

The goldfinch has only recently 
been officially protected in 
Algeria, Tunisia and Morocco, 
where its populations have 
been declining rapidly over the 
past two decades. The price of 
a single live bird (pictured) is 
now US$50-500, equivalent 
to 25-250% of the typical local 
monthly salary. This has caused 
trapping and by-catch to escalate. 
Many of the captured goldfinches 
perish under poor transport 
conditions. 

I suggest that local people 
should be taught to divert their 
admiration for the goldfinch’s 
charms into ensuring its 
protection. Netting would stop if 
instead the goldfinch became an 
emblematic conservation symbol 
of the region, and a ‘safety
umbrella’ for other migrating
Palaearctic songbirds (see 
J.-M. Roberge and P. Angelstam 
Conserv. Biol. 18, 76-85; 2004). 
Rassim Khelifa Université 
Mouloud Mammeri de Tizi 
Ouzou, Algeria. 
rassimkhelifa@gmail.com 


Don’t undervalue 
the social sciences 


Too many physicists, chemists 
and biologists perceive the social 
sciences and humanities as less 
rigorous and less intellectually 
demanding domains than their 
own. Research into expert 
performance calls these attitudes 
into question. 


Thousands of hours of
deliberate practice are needed
to become highly competent
in any endeavour that requires
skill (see The Cambridge
Handbook of Expertise and
Expert Performance Cambridge
Univ. Press; 2006). Moreover,
the time invested before making
a world-class contribution to
any major field is similar, be it in
chess, music, basketball, history 
or flying a plane (A. Ericsson and 
R. Pool Peak Bodley Head; 2016). 

So, distinguished scholars 
from different fields are likely 
to be comparably proficient in 
the skills relevant to their work. 
Assuming that top researchers 
have devoted roughly the same 
amount of effort to developing 
their domain-specific skills, 
the wider implication is that 
different fields are roughly 
equally advanced in terms of 
dealing with the challenges 
they face. 

If physics, say, seems more 
developed than social science, 
then this may be because the 
field has been established for 
longer or that the challenges 
are easier to overcome. 

Brian Martin University of 
Wollongong, Australia. 
bmartin@uow.edu.au 


US panel risks infant 
and researcher lives 


As the chief executives of the 
biotech companies Ganogen 
and StemExpress, we are among 
a broad sweep subpoenaed — 
along with scientists, graduate 


students and physicians also 
engaged in research involving 
fetal tissue — by the US House 
Select Investigative Panel on 
Infant Lives. In our view, this 
witch-hunt endangers infants 
and researchers and must end. 

The panel's stated aim is to “get 
the facts about medical practices 
of abortion service providers 
and the business practices of 
the procurement organizations 
who sell baby body parts”. On 
1 June, it released the names, 
addresses, e-mail contacts and 
telephone numbers of many 
of us in an open letter to the 
US Department of Health and 
Human Services. We consider 
this to be a callous disregard of 
the threat posed by activists to 
medical researchers who are in 
fact engaged in saving young 
lives (see Nature Biotechnol. 34, 
445; 2016). 

Research involving fetal 
tissue led to vaccines against 
polio, rubella and chickenpox.
It was central to proving the link
between Zika virus and infant 
microcephaly (H. Tang et al. Cell 
Stem Cell 18, 587-590; 2016),
and is essential for developing a 
vaccine against the virus (Nature 
532, 16; 2016). The chair of the 
panel, Representative Marsha 
Blackburn of Tennessee, should 
note that her constituents, and 
those of committee members 
Diane Black (Tennessee) and 
Vicki Hartzler (Missouri), are 
especially vulnerable to Zika 
because the mosquito vector, 
Aedes aegypti, is more prevalent 
in the southern states. 

Eugene Gu Ganogen Research
Institute, Redwood City, USA.
Cate Dyer StemExpress, 
Placerville, USA. 
eugenegu@ganogen.org 


Food security needs 
social-science input 


As members of the Climate- 
Resilient Open Partnership 
for Food Security project 
supported by the World Wide 
University Network (see 
go.nature.com/28ygwtc),
we contend that basic social-
science theory and methods 
could transform interventions 
aimed at improving food 
production. 

Food security calls for 
agricultural advances, 
adaptation to climate change 
and more efficient use of natural 
resources. Just as important 
are the social and political 
considerations of reforming 
food production and 
distribution systems. 

All too often, poor 
communication between the 
scientific community and the 
public, including potential 
users, impedes utilization 
of new technologies. Social 
networks, power inequalities and 
institutional resistance to change 
must all be taken into account if 
the system is to be reformed (see 
W. W. Powell et al. in The Science 
of Science Policy 31-55, Stanford 
Univ. Press; 2011). 

We therefore suggest that 
research consortia in food 
security and their funding 
agencies should include social 
scientists from the outset (see 
A. Viseu Nature 525, 291; 
2015). This would dramatically 
enhance project management 
and conceptual development 
by dealing with the complex 
interactions between natural and 
social factors. 

Klaus Nüsslein*, Om Parkash
Dhankher* University of 
Massachusetts Amherst, USA. 
nusslein@microbio.umass.edu 
*Supported by 13 signatories (see 
go.nature.com/299szyy). 




For News & Views online, go to 
nature.com/newsandviews 


ASTROPHYSICS 


Rare data from a lost satellite 


The Hitomi astronomical satellite observed gas motions in the Perseus galaxy cluster before losing contact with
Earth. Its findings are invaluable to studies of cluster physics and cosmology.


ELIZABETH BLANTON 


Measurements of motions in the
hot gas that lurks in clusters of
galaxies provide insight into the
level of turbulence in these objects, and into
larger-scale flows that are related to a cluster’s 
merger history and to outflows from its 
central supermassive black hole. The amount 
of turbulence affects measurements of clus- 
ter mass, which are used to constrain values 
of the cosmological parameters that govern 
our Universe. Indirect methods have been 
used previously to infer these gas motions, 
but on page 117, the Hitomi collaboration 
(Aharonian et al.¹) reports direct measurements
of gas motions in the Perseus cluster
using high-resolution spectra acquired by a 
type of detector that was unique to the Japanese 
Hitomi X-ray astronomical satellite; similar 
detectors are not available on any other active 
X-ray satellite. The Perseus cluster was the 
only cluster to be observed by Hitomi before 
the satellite’s premature demise², which means
that such observations will be rare for the 
foreseeable future. 

Clusters of galaxies are the largest 
gravitationally bound objects in the Universe, 
and are crucial probes of galaxy evolution and 
of cosmology. In the currently favoured lambda 
cold dark matter model of the Universe, structure
forms in a ‘bottom-up’ fashion: smaller
structures form first and then merge to gen- 
erate larger ones. Individual galaxies and 
small groups of galaxies therefore form before 
merging into more-massive clusters. 

Cosmological models vary, for example, 
by including different amounts of dark matter 
and dark energy in the Universe, and thus 
predict different cluster-formation histories. 
The observed mass distribution of clusters as a 
function of time over the evolution of the Universe
can place constraints on these models³,⁴.
But the masses of individual clusters must be
measured robustly to place these constraints 
with high accuracy. 

Clusters of galaxies typically contain from 
about fifty to thousands of galaxies, along with 
dark matter and diffuse, hot (about 10⁷–10⁸ kelvin)
gas that emits X-ray radiation. One way
of measuring a cluster’s mass is to analyse the 
emission from this gas, which can be assumed 


40 | NATURE | VOL 535 | 7 JULY 2016 


Figure 1 | X-ray image of the Perseus cluster of galaxies taken by the Chandra observatory. The
central bubbles (dark regions) and ripples are associated with outflows from material that surrounds
the cluster’s central supermassive black hole. On the basis of measurements taken by the now lost Hitomi
satellite, Aharonian et al.¹ report that turbulence in the central region of the cluster is low, which suggests
that errors in measurements of the masses of galaxy clusters using X-ray observations are small. The
image is approximately 460,000 parsecs (1.5 million light years) across.


to be in hydrostatic equilibrium — that is, 
the pressure gradient of the gas in the out- 
ward direction is balanced by gravity pulling 
inward; this gravity depends directly on the 
total mass of the cluster. The gas pressure can 
be readily determined from measurements 
of gas density and temperature, which are 
acquired using data from existing X-ray space 
observatories such as the Chandra X-ray 
Observatory and XMM-Newton. 
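
The hydrostatic argument above can be written compactly. The following is a sketch only, assuming a spherically symmetric cluster and an ideal gas; the symbols (radius r, gas density ρ, temperature T, mean molecular weight μ, proton mass m_p) are notation introduced here for illustration and do not come from the article.

```latex
% Hydrostatic equilibrium: outward pressure gradient balances gravity
\frac{dP}{dr} = -\frac{G\,M(<r)\,\rho(r)}{r^{2}},
\qquad
P = \frac{\rho\,k_{\mathrm{B}}T}{\mu m_{\mathrm{p}}}

% Solving for the total mass enclosed within radius r
M(<r) = -\frac{k_{\mathrm{B}}T(r)\,r}{G\,\mu m_{\mathrm{p}}}
\left(\frac{d\ln\rho}{d\ln r} + \frac{d\ln T}{d\ln r}\right)
```

Under these assumptions, the measured density and temperature profiles are all that is needed to estimate the enclosed mass, which is why any unaccounted pressure term feeds directly into the mass estimate.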

But turbulent motions in the gas can 
potentially add to the pressure, and neglect- 
ing their contribution can cause errors in mass 
determinations. Measuring these motions
directly requires high-resolution spectroscopy
of the diffuse, hot gas, which can extend 
across millions of light years in space. Indirect 
constraints on turbulent gas motions have 
previously been made using other methods⁵,⁶,
and upper limits have been set on the basis of 
spectroscopic measurements taken by XMM-Newton⁷.
The Hitomi X-ray observatory was
the only active facility to carry a type of instru- 
ment called a calorimeter that could precisely 
measure motions in the gas through changes 
in frequency (Doppler shifts) and broadening 
of emission lines in spectral data. 
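
As a rough illustration of how line shifts and widths encode gas motions, the relations below are a sketch assuming non-relativistic velocities and Gaussian broadening terms that add in quadrature. The symbols (rest-frame line energy E_0, line-of-sight velocity v_los, ion mass m_ion, instrumental width σ_inst, turbulent velocity dispersion σ_turb) are introduced here for illustration and are not taken from the article.

```latex
% Bulk motion along the line of sight shifts the line centre
% (v_los > 0 for gas moving away from the observer)
E_{\mathrm{obs}} \simeq E_{0}\left(1 - \frac{v_{\mathrm{los}}}{c}\right),
\qquad v_{\mathrm{los}} \ll c

% Thermal and turbulent motions broaden the line
\sigma_{E}^{2} = \sigma_{\mathrm{inst}}^{2}
 + \frac{E_{0}^{2}}{c^{2}}
 \left(\frac{k_{\mathrm{B}}T}{m_{\mathrm{ion}}} + \sigma_{\mathrm{turb}}^{2}\right)
```

In this picture, the turbulent term can be isolated only if the instrumental width is small and well calibrated, which is consistent with the emphasis on a high-resolution detector.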

The Hitomi collaboration chose the Perseus 


NASA/CXC/STANFORD/I. ZHURAVLEVA ET AL. 


cluster of galaxies as an early observation target 
because it is the brightest X-ray-emitting 
cluster in the sky and has been studied 
extensively by other orbiting X-ray observa- 
tories, including Chandra, XMM-Newton 
and Suzaku⁸,⁹. It therefore provides an
excellent test case for measuring motions in its 
hot gas. It is considered to be ‘relaxed’, mean- 
ing that it has not undergone a large-scale 
merger with another massive cluster in billions 
of years. The central galaxy in the cluster hosts 
a supermassive black hole, and outflows of 
high-energy particles orbiting the black hole 
have inflated large bubbles in the cluster’s 
diffuse gas (Fig. 1). The goals of the study were 
to measure bulk velocities related to these 
outflows, as well as turbulence. 

The main result is that the velocities of the 
gas are quite low, approximately 150 kilo- 
metres per second. A notable implication of 
this is that the additional contribution to the 
pressure that is associated with turbulence is 
constrained to be only a few per cent of the 
thermal pressure (the main component of the 
total pressure). This means that measurements 
of cluster mass based on X-ray observations of 
hot gas, assuming hydrostatic equilibrium and 
neglecting turbulent pressure, will have only 
small associated errors. This is good news for 
studies that use the masses as the basis for constraining
cosmological parameters³,⁴.
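
The "few per cent" figure can be checked with a back-of-envelope calculation. The sketch below is illustrative only: it assumes the quoted 150 kilometres per second is a one-dimensional velocity dispersion of isotropic turbulence, and it adopts a gas temperature of 4 keV and a mean molecular weight of 0.6, neither of which is stated in the article. The function name is hypothetical.

```python
# Back-of-envelope ratio of turbulent to thermal pressure in cluster gas.
# Assumptions (not from the article): isotropic turbulence described by a
# 1D velocity dispersion, gas temperature kT = 4 keV, mean molecular
# weight mu = 0.6.

PROTON_MASS_KG = 1.67262e-27   # proton mass in kilograms
KEV_IN_JOULES = 1.602177e-16   # one kiloelectronvolt in joules


def turbulent_to_thermal_pressure(sigma_km_s, kT_keV, mu=0.6):
    """Return P_turb / P_thermal for gas with 1D velocity dispersion sigma.

    P_turb    = rho * sigma^2
    P_thermal = rho * kT / (mu * m_p)
    so the density cancels and the ratio is mu * m_p * sigma^2 / kT.
    """
    sigma_m_s = sigma_km_s * 1.0e3
    return mu * PROTON_MASS_KG * sigma_m_s**2 / (kT_keV * KEV_IN_JOULES)


if __name__ == "__main__":
    ratio = turbulent_to_thermal_pressure(sigma_km_s=150.0, kT_keV=4.0)
    print(f"Turbulent / thermal pressure: {ratio:.1%}")
```

With these assumed inputs the ratio comes out at a few per cent, in line with the statement above. A hotter or cooler gas, or a different velocity dispersion, would change the number in proportion to the square of the velocity and inversely with temperature.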

However, these measurements were made 
for only one cluster and only in the cluster’s 
central region, and are therefore not necessarily 
applicable to clusters in general. In addition, the 
observations were made early in the mission, 
before all of the associated calibration proce- 
dures were available. The limitations on the 
available calibrations translate to an increase in 
the uncertainties in the measurements, particu- 
larly in the systematic errors in the line-of-sight 
velocities. Nevertheless, even with such uncer- 
tainties, these cluster gas velocities are the most 
precise yet measured. 

Future missions — including the European 
Athena X-ray Observatory, scheduled for 
launch in 2028, and the possible US X-ray 
Surveyor mission — should allow further 
insight into the gas motions in clusters of gal- 
axies. It would be useful eventually to measure 
velocities in a range of cluster environments,
such as cluster cores and outer regions, clus- 
ters with and without bubbles that result from 
outflows around supermassive black holes, 
and clusters in various stages of mergers. 
The Perseus observations from Hitomi have 
given us an important first look at the gas 
motions in a galaxy cluster, but many more 
exciting environments and details remain 
to be explored.


Elizabeth Blanton is at the Institute for 
Astrophysical Research and the Department 
of Astronomy, Boston University, Boston, 
Massachusetts 02215, USA. 

e-mail: eblanton@bu.edu 


1. The Hitomi collaboration. Nature 535, 117-121 
(2016). 

2. Witze, A. Nature 533, 18-19 (2016). 

3. Allen, S. W., Evrard, A. E. & Mantz, A. B. Annu. Rev. 
Astron. Astrophys. 49, 409-470 (2011). 

4. Vikhlinin, A. et al. Astrophys. J. 692, 1060-1074 
(2009). 

5. Zhuravleva, I. et al. Mon. Not. R. Astron. Soc. 450, 
4184-4197 (2015). 

6. Randall, S. W. et al. Astrophys. J. 805, 112 (2015). 

7. Sanders, J. S., Fabian, A. C. & Smith, R. K. Mon. Not. 
R. Astron. Soc. 410, 1797-1812 (2011). 

8. Fabian, A. C. et al. Mon. Not. R. Astron. Soc. 418, 
2154-2164 (2011). 

9. Simionescu, A. et al. Astrophys. J. 757, 182 
(2012). 


In search of the 
memory molecule 


The protein PKM-ζ has been proposed to regulate the maintenance of memory in 
rodents, but this theory has been questioned. The finding that another isoform of 
the protein acts as a backup if PKM-ζ is lacking will influence this debate. 


PAUL W. FRANKLAND & SHEENA A. JOSSELYN 


An understanding of memory has long 
been a goal of neuroscience. One 
question that has attracted particular 
attention is whether there is a specific molecule 
that maintains memories. After almost two 
decades of careful work, neuroscientist Todd 
Sacktor and colleagues thought they had the 
answer. In 2006, the authors reported¹ that an 
atypical isoform of the enzyme protein kinase 
C, called PKM-ζ, was involved in maintaining 
memories in mice, and that an inhibitor 
of PKM-ζ could erase memories. The results 
were subsequently questioned²,³, and controversy 
ensued. Writing in eLife, the same group 
that performed the 2006 study opens a new 
chapter in this debate⁴, arguing that PKM-ζ 
should be restored to its pre-eminent status as 
the memory molecule. 

More than half a century ago, the psychologist 
Donald Hebb proposed that the synaptic 
connections between two neurons are 
strengthened when the neurons fire together⁵. 
He suggested that this form of synaptic 
strengthening provided the basis for the formation 
of long-term memories, enabling many 
neurons to be linked together in cell assemblies 
that serve as the physical substrates of 
memory, called engrams. It was later discovered⁶ 
that high-frequency neural stimulation 
led to persistent increases in synaptic strength, 
known as long-term potentiation (LTP). Most 
neuroscientists embraced the idea that understanding 
LTP was the key to understanding 
memory⁷. The race was on to identify the 
molecular machinery involved in LTP. 

One molecule in particular emerged from 
the fray. Although dozens of molecules were 
involved in initial synaptic strengthening 
following high-frequency stimulation, only 
PKM-ζ seemed to be crucial for maintaining 
these strengthened connections⁸. In PKM-ζ, 
therefore, the activity of a single molecule was 
linked to the persistence of memory. Subsequently, 
several experiments showed that 
inhibition of PKM-ζ after memory formation 
(for example, by using a 13-amino-acid protein 
fragment called ZIP, which mimics the 
natural substrate that inactivates PKM-ζ) led 
to memory erasure’”. 

However, enthusiasm surrounding PKM-ζ 
waned dramatically following the discovery 
that mice in which PKM-ζ had been deleted 
showed normal LTP and memory²,³. More 
puzzling still, ZIP produced LTP-reversing and 
memory-erasing effects in mice that lacked 
PKM-ζ, similar to its effects in normal mice 
that expressed the enzyme. The amnesic 
effects of ZIP, therefore, must arise 
through another mechanism. 

Do these results indicate that PKM-ζ is 
not necessary for memory? Much of the initial 
clamour surrounding the 2013 papers²,³ 
focused on this possibility. It seems unlikely, 
however, because more than one method for 
inhibiting PKM-ζ erases memories’. An alternative 
possibility is that PKM-ζ has an essential 
role in LTP maintenance and memory persistence 
in normal mice, but compensatory 
processes that are sensitive to ZIP emerge in 
PKM-ζ-deficient mice. 

This brings us to the detective work of the 
current study. Tsokas et al.⁴ first confirmed 
that ZIP reversed LTP in both normal and 
PKM-ζ-deficient mice, indicating that trivial 
procedural differences could not resolve the 
controversy. Next, the authors showed that 
induction of LTP produced an increase in 
PKM-ζ in slices taken from the hippocampal 
region of the brains of normal mice, and that 
there was a sustained increase in another 
atypical protein kinase C isoform, PKC-ι/λ, in 
slices from PKM-ζ-deficient mice. Moreover, 
injecting either PKC-ι/λ or PKM-ζ directly into 
hippocampal CA1 pyramidal neurons induced 
LTP in slices from normal mice. ZIP treatment 
reversed the effects of either protein injection, 
hinting that PKC-ι/λ might be the mystery 
molecule that compensates for loss of PKM-ζ. 

To test this idea directly, Tsokas and colleagues 
inhibited either PKM-ζ or PKC-ι/λ and 
examined LTP in hippocampal slices (Fig. 1). 
In slices from control mice, inhibiting PKM-ζ 
blocked LTP, but PKC-ι/λ inhibition had no 
effect. By contrast, in PKM-ζ-deficient mice, 
inhibiting PKC-ι/λ blocked LTP, but PKM-ζ 
inhibition was ineffective. The same pattern 
emerged when the authors examined the 
effects of PKC-ι/λ and PKM-ζ inhibition on 
memory in control and PKM-ζ-deficient mice. 

Figure 1 | Memory loss modulated. In place-avoidance tests, mice learn 
that they will receive a foot shock if they move over a certain part of a rotating 
test arena. During this learning, the synaptic connections between neurons 
are strengthened in a process called long-term potentiation (LTP), which is 
required for memory formation. Tsokas et al.⁴ investigated how two atypical 
isoforms of the enzyme protein kinase C — PKM-ζ and PKC-ι/λ — regulate 
memory maintenance following LTP induction. a, In wild-type mice, levels of 
PKM-ζ rise following learning. Inhibition of PKM-ζ in these mice causes loss 
of LTP and hence loss of memory, so the mice forget how to avoid a shock. By 
contrast, inhibition of PKC-ι/λ has no effect on memory of the learned activity. 
b, In mice that lack the gene encoding PKM-ζ, PKC-ι/λ is elevated following 
LTP induction. Inhibition of PKC-ι/λ causes LTP and memory loss, whereas 
PKM-ζ inhibition has no effect. Thus PKM-ζ is the main substrate for memory 
maintenance in normal conditions, but PKC-ι/λ can compensate in its absence. 

Do these latest results restore the position 
of PKM-ζ as the leading memory molecule? 
The allure of the PKM-ζ theory is the idea that 
a single molecule is responsible for maintaining 
LTP and memories. The current findings 
are not inconsistent with this view. However, 
in their experiments, Tsokas et al. inhibited 
PKM-ζ in normal mice before (rather than 
after) LTP and memory induction. This means 
that they cannot directly evaluate the enzyme’s 
role in the persistence of LTP and memory. 

The PKM-ζ saga serves as a cautionary tale 
about the specificity of the tools that we use 
to examine brain function and establish causality. 
The controversy exposed the bluntness 
of ZIP as a tool for probing PKM-ζ function 
because it clearly affects other molecules and 
may even lead to neuronal silencing¹⁰. Equally, 
seemingly more specific interventions, such as 
genetic deletion of PKM-ζ, produced a cascade 
of unintended compensatory changes, which 
clouded interpretations and masked predicted 
outcomes. This limitation is not restricted to 
genetic mutations, but extends to any intervention 
that perturbs brain function (such 
as optogenetic or chemogenetic strategies in 
which genetically introduced proteins can 
be activated and inhibited in response to light 
or drugs). 

As the PKM-ζ debate rumbles on, there 
is a broader mystery to consider. Molecular 
neuroscientists such as Tsokas and colleagues 
present a static view of the engram, in which 
patterns of synaptic changes that are initiated 
during memory encoding are maintained over 
the lifetime of the memory. By contrast, systems 
neuroscientists present a more dynamic 
picture, emphasizing memory maintenance in 
the midst of broad changes in the synapses¹¹ 
and even the neurons¹² that correspond to the 
engram. A full account of memory persistence 
needs to merge these molecular and systems 
perspectives, allowing the twain to meet. 


Paul W. Frankland and Sheena A. Josselyn 
are in the Program in Neurosciences and 
Mental Health, The Hospital for Sick Children, 
Toronto, Ontario M5G 1X8, Canada. They 
are also in the Departments of Physiology and 
Psychology and at the Institute of Medical 
Science, University of Toronto, Toronto. 


1. Pastalkova, E. et al. Science 313, 1141-1144 (2006). 
2. Lee, A. M. et al. Nature 493, 416-419 (2013). 
3. Volk, L. J., Bachman, J. L., Johnson, R., Yu, Y. & Huganir, R. L. Nature 493, 420-423 (2013). 
4. Tsokas, P. et al. eLife 5, e14846 (2016). 
5. Hebb, D. O. The Organization of Behavior: A Neuropsychological Theory (Wiley, 1949). 
6. Bliss, T. V. P. & Lømo, T. J. Physiol. (Lond.) 232, 331-356 (1973). 
7. Stevens, C. F. Neuron 20, 1-2 (1998). 
8. Ling, D. S. F. et al. Nature Neurosci. 5, 295-296 (2002). 
9. Shema, R. et al. Science 331, 1207-1210 (2011). 
10. LeBlancq, M. J., McKinney, T. L. & Dickson, C. T. J. Neurosci. 36, 6193-6198 (2016). 
11. Attardo, A., Fitzgerald, J. E. & Schnitzer, M. J. Nature 523, 592-596 (2015). 
12. Rubin, A., Geva, N., Sheintuch, L. & Ziv, Y. eLife 4, e12247 (2015). 


This article was published online on 29 June 2016. 


CHEMICAL PHYSICS 


Quantum control of 
light-induced reactions 


An investigation of how ultracold molecules are broken apart by light reveals 
surprising, previously unobserved quantum effects. The work opens up avenues 
of research in quantum optics. SEE LETTER P.122 


DAVID W. CHANDLER 


The rupture of molecular bonds by the 
absorption of light drives chemistry in 
the atmosphere, causes DNA damage 
and the associated repair response, and provides 
a superb tool to study how molecules 
absorb light and then distribute and dispose 
of its energy. On page 122, McDonald et al.¹ 
report their study of the light-induced break-up 
(photodissociation) of ultracold strontium 
molecules, Sr₂. Their work provides insight into 
how molecules behave in the quantum regime 
of ultralow-energy dynamics that occurs just 
above energy thresholds for photodissociation. 

Early photodissociation studies focused on 
the energetics² of the products formed from 
diatomic molecules, and on the products’ angular 
distribution³ — the distribution of angles 
at which they recoil relative to the direction of 
polarization (the polarization axis) of the light 
that excited them. If the energy of the photon 
absorbed by the diatomic molecule and the 
velocities of the resulting atomic fragments 
were known, then the bond energy of the 
molecule could be directly determined. The 
accuracy of these determinations depended 
on how cold the molecule was initially, and 
on how accurately one could measure the 
velocities of the products. 
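
The energy bookkeeping described above can be sketched in a few lines. This is a minimal illustration, assuming a homonuclear diatomic molecule whose two atoms recoil with equal speeds and are formed in their ground states; the function name and the example numbers (a 355-nanometre photon, atoms of 35 atomic mass units recoiling at 1,670 m/s) are hypothetical and are not taken from any of the studies discussed.

```python
# Energy conservation in photodissociation of a homonuclear diatomic molecule:
#   photon energy + internal energy of parent = bond energy + kinetic energy
# Assumes both atoms are produced in their ground states (hypothetical setup).

PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8
ATOMIC_MASS_UNIT_KG = 1.66053907e-27
ELECTRONVOLT_J = 1.602176634e-19


def bond_energy_ev(wavelength_nm, atom_mass_amu, fragment_speed_m_s,
                   parent_internal_ev=0.0):
    """Estimate the bond energy (eV) from the photon and the fragment speed."""
    photon_j = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1.0e-9)
    # Two equal-mass atoms, each carrying (1/2) m v^2 in the molecular frame.
    kinetic_j = atom_mass_amu * ATOMIC_MASS_UNIT_KG * fragment_speed_m_s ** 2
    available_j = photon_j + parent_internal_ev * ELECTRONVOLT_J
    return (available_j - kinetic_j) / ELECTRONVOLT_J


if __name__ == "__main__":
    # Made-up example values, chosen only to show the arithmetic.
    print(f"bond energy is about {bond_energy_ev(355.0, 35.0, 1670.0):.2f} eV")
```

The `parent_internal_ev` argument shows why a cold parent molecule matters: any unknown rotational or vibrational energy in the parent adds directly to the uncertainty of the inferred bond energy, as does any error in the measured speed.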

In the early experiments⁴, diatomic 
molecules were irradiated with laser light, 
and if the fragments were found to fly pre- 
dominantly parallel to the laser polariza- 
tion axis, then the transition dipole moment 
responsible for the light absorption was 
said to be parallel; similarly, perpendicular 
transitions were named after the associated 
perpendicular recoil. The transition dipole 
moment describes coupling between the two 
electronic states responsible for light absorp- 
tion, and this classification was helpful in 
understanding its nature. For polyatomic 
molecules, the transition dipole moment does 
not have to align with a particular molecu- 
lar axis, and many factors affect the meas- 
ured angular distribution of the fragments. 
Measurements of the velocities of fragments 
provide information about the dynamics of the 
energy deposited within molecules as it evolves 
into the kinetic energy of the fragments. 

Hundreds of photodissociation studies have 
been performed because of the fundamental 
information that can be obtained. With the 
advent of laser-based imaging techniques⁵⁻⁸ in 
the late 1980s, it became possible to measure 
velocities at high resolution (approximately 
a few metres per second) for particular elec- 
tronic states of the products, by projecting the 
ionized products onto position-sensitive ion 
detectors. However, these experiments typically 
used pulsed-dye lasers (which produce light at a 
low frequency resolution of about 3,000 mega- 
hertz) to dissociate molecules and detect the 
products. This precludes experiments such as 
those performed by McDonald and colleagues, 
in which molecules are dissociated by photons 
that have a much higher, 1 MHz frequency 
resolution and energies just above the dissocia- 
tion threshold of the molecule (that is, at light 
frequencies between 5 and 400 MHz greater 
than the dissociation-threshold frequency). 

Figure 1 | Quantum effects in photodissociation. McDonald et al.¹ studied the light-induced 
fragmentation (photodissociation) of diatomic strontium molecules, Sr₂, and observed surprising angular 
distributions of the resulting products. The left-hand panel shows a two-dimensional representation 
of the angular distribution of fragments obtained from Sr₂ in a particular rotational quantum state, as 
predicted by quasiclassical theory; hot colours indicate higher distributions of fragments. The right-hand 
panel indicates the experimentally observed pattern, which can be explained only by using a full 
quantum-mechanical description of photodissociation. 


Moreover, these experiments typically used 
supersonic molecular beams as a source of 
cool molecules. When a high-pressure gas is 
expanded into a vacuum to form a molecular 
beam, the flow is directed forward supersonically 
at the expense of the kinetic energy associated 
with the other directions of flight and 
with the gas’s internal degrees of freedom (the 
rotational and vibrational motion of its molecules). 
This allows molecules to be cooled 
to temperatures of a few kelvin even though 
they fly at velocities close to 1,000 m s⁻¹, 
with a spread of about 50 m s⁻¹. McDonald 
and co-workers, however, wanted to study 
photodissociation fragments moving at only 
about 1 m s⁻¹ (extremely slowly for a molecule, 
and corresponding to a temperature of tens of 
millikelvin). To see such slow fragments, the 
authors held their molecules in a stationary 
laser trap, photodissociated them using a light 
pulse and then imaged the fragments after they 
had flown for about a hundred microseconds. 
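
The numbers quoted above hang together, as a short calculation suggests. The sketch below converts a photon frequency excess above the dissociation threshold into a fragment speed, an equivalent temperature and a flight distance. It assumes the strontium-88 isotope, an equal split of the excess energy between the two atoms, no other energy sinks, and the order-of-magnitude convention T = m v² / k; all of these are assumptions made for illustration rather than details given in the text.

```python
# From frequency excess above threshold to fragment speed and temperature.
# Assumptions (not from the article): strontium-88 atoms, the excess photon
# energy h * delta_nu is shared equally by the two atoms, and the equivalent
# temperature is defined loosely as m * v^2 / k_B.

PLANCK_J_S = 6.62607015e-34
BOLTZMANN_J_K = 1.380649e-23
ATOMIC_MASS_UNIT_KG = 1.66053907e-27
SR88_MASS_AMU = 87.9056  # assumed isotope


def fragment_speed_m_s(excess_mhz, atom_mass_amu=SR88_MASS_AMU):
    """Speed of each atom when 2 * (1/2) m v^2 = h * delta_nu."""
    excess_energy_j = PLANCK_J_S * excess_mhz * 1.0e6
    return (excess_energy_j / (atom_mass_amu * ATOMIC_MASS_UNIT_KG)) ** 0.5


def equivalent_temperature_mk(speed_m_s, atom_mass_amu=SR88_MASS_AMU):
    """Order-of-magnitude temperature (millikelvin) for a given speed."""
    mass_kg = atom_mass_amu * ATOMIC_MASS_UNIT_KG
    return mass_kg * speed_m_s ** 2 / BOLTZMANN_J_K * 1.0e3


def flight_distance_mm(speed_m_s, flight_time_us):
    """Distance travelled (millimetres) in a given flight time."""
    return speed_m_s * flight_time_us * 1.0e-6 * 1.0e3


if __name__ == "__main__":
    for excess in (5.0, 400.0):
        speed = fragment_speed_m_s(excess)
        print(f"{excess:5.0f} MHz above threshold: {speed:.2f} m/s")
    print(f"1 m/s is roughly {equivalent_temperature_mk(1.0):.0f} mK")
    print(f"flight in 100 microseconds: {flight_distance_mm(1.0, 100.0):.2f} mm")
```

Under these assumptions, frequencies a few hundred megahertz above threshold give speeds of order 1 m/s, and a fragment moving at that speed travels about a tenth of a millimetre in a hundred microseconds.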

Molecules can interact with light through 
either the light’s oscillating electric field (which 
causes electric dipole transitions) or its oscil- 
lating magnetic field (magnetic dipole transi- 
tions). For most covalently bound molecules, 
the light intensity required to produce electric 
dipole transitions is a million times less than 
that required for magnetic dipole transitions. 
McDonald et al. are the first to have excited 
a pure magnetic transition and observed the 
fragments. This was possible because the Sr₂ 
molecules in this study are formed in the high- 
est vibrational energy levels of the molecule’s 
ground state, and therefore have a very long 
bond length, which increases the magnetic 
transition dipole moment by approximately 
1,000-fold⁹. 

Another groundbreaking feature of McDonald 
and colleagues’ work is that the Sr₂ molecules 
were prepared in a single rotational and 
vibrational quantum state by the laser-induced 
association of ultracold atoms, in the presence 
of an oriented magnetic field. Each state represents 
the projection (M) of a molecule’s angular 
momentum vector (J) onto a quantization axis 
(in this case, the quantization axis aligns with 
the magnetic field). Several M states exist for 
each J value, and in the absence of a magnetic 
field they have the same energy (they are said 
to be degenerate); the number of M states is 
defined by the formula 2J + 1. When J is zero, 
it has no magnitude and no alignment in space. 
In a magnetic field, the M states do not have the 
same energy, because rotating electrons create 
a magnetic field that can be either aligned or 
counteraligned with the external magnetic-field 
quantization axis. 
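
The 2J + 1 counting rule can be made concrete with a small enumeration. This is only an illustrative sketch: it assumes integer values of J, and the function name is hypothetical.

```python
# Enumerate the projection quantum numbers M for a given integer J.
# Illustrative helper only; integer J is assumed.


def m_states(j):
    """Return the allowed projections M = -J, -J + 1, ..., +J."""
    if j < 0:
        raise ValueError("J must be non-negative")
    return list(range(-j, j + 1))


if __name__ == "__main__":
    for j in (0, 1, 2):
        states = m_states(j)
        print(f"J = {j}: {len(states)} M states {states}")
```

For J = 0 there is a single state (M = 0), consistent with the statement that such a state has no alignment in space; for J = 2 there are five states, which are degenerate only when no magnetic field is present.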

The authors’ experiments started from 
a single (J,M) quantum state formed in the 
laser-association process. All of the quantum 
states reached during photodissociation were 
dictated by the starting state, and by the laser 
frequency and polarization relative to the mag- 
netic field’s axis. When the researchers obtained 
a single excited quantum state, they observed 
fragments recoiling predominantly paral- 
lel or perpendicular to the laser polarization. 
But if several degenerate quantum states were 
excited and interfered with each other, then the 
observed velocity distribution deviated spec- 
tacularly from purely parallel or perpendicular. 
These unexpected and previously unobserved 
angular distributions can be described only by 
a full quantum-mechanical treatment of the 
light-absorption process (Fig. 1). 

At present, this sort of experiment is limited 
to a few diatomic molecules — some of which, 
like Sr₂, are not covalently bound — that can be 
generated by cold-atom techniques. However, 
there is much to be learnt from these studies, 
and as scientists learn to cool and trap a larger 
array of covalently bound molecules, the tech- 
niques developed and knowledge gained will 
provide the foundation for future research — 
for example, in polyatomic molecules. The 
photo-physics of polyatomic molecules 
is more complex than that of diatomic 
molecules, because multiple mechanisms 
couple their electronic states to each other, and 
several fragmentation pathways are possible. In 
the meantime, I personally found this article a 
joy to absorb. 


David W. Chandler is at the Combustion 
Research Facility, Sandia National Laboratories, 
Livermore, California 94550, USA. 
e-mail: chand@sandia.gov 


1. McDonald, M. et al. Nature 535, 122-126 (2016). 
2. Busch, G. E., Mahoney, R. T., Morse, R. I. & Wilson, K. R. J. Chem. Phys. 51, 449-450 (1969). 
3. Solomon, J. J. Chem. Phys. 47, 889-895 (1967). 
4. Zare, R. N. & Herschbach, D. R. Appl. Opt. (Suppl.) 4, 193-200 (1965). 
5. Chandler, D. W. & Houston, P. L. J. Chem. Phys. 87, 1445-1447 (1987). 
6. Eppink, A. T. J. B. & Parker, D. H. Rev. Sci. Instrum. 68, 3477-3484 (1997). 
7. Heck, A. J. R. & Chandler, D. W. Annu. Rev. Phys. Chem. 46, 335-372 (1995). 
8. Ashfold, M. N. R. et al. Phys. Chem. Chem. Phys. 8, 26-53 (2006). 
9. McGuyer, B. H. et al. Nature Phys. 11, 32-36 (2015). 


CONSERVATION 


The rainforest’s 
‘do not disturb’ signs 


A study reveals that human-driven disturbances in previously undisturbed Amazon 
rainforest can cause biodiversity losses as severe as those of deforestation. Urgent 
policy interventions are needed to preserve forest quality. SEE LETTER P.144 


DAVID P. EDWARDS 


As we enter the Anthropocene, a 
new geological epoch shaped by 
human activity, mankind is driving a 
global biodiversity extinction crisis¹. The conversion 
of forest to agricultural land is widely 
considered to be the leading cause of this 
crisis, especially in the hyperdiverse tropics², 
so avoiding deforestation is the predominant 
strategy for biodiversity conservation³. 
On page 144, Barlow et al.⁴ present a landmark 
field study of Amazonian biodiversity 
in which they challenge the adequacy 
of this strategy by demonstrating the 
striking magnitude of several types of human-associated 
forest disturbance that are less 
immediately visible than deforestation. 

Many studies have identified the negative 
effects on biodiversity of individual kinds of 
disturbance in tropical forests. These include 
the hunting of large animals⁵, the selective 
logging of large, marketable trees⁶, forest fires⁷ 
and the creation of new edges to primary 
forests (those forests that have never been 
fully cleared) that, owing to deforestation, are 
buffeted by the hotter, drier and windier conditions 
found on adjacent farmland⁸ (Fig. 1). 
However, by focusing on only one form of disturbance, 
such studies may have overlooked 
much greater conservation losses from the 
combined effects of forest disturbances. 

Barlow and colleagues conducted biodiversity 
censuses across multiple landscapes and 
then developed a computational method for 
evaluating conservation losses (termed the 
‘conservation value deficit’, a numerical value 
calculated by assessing biodiversity in disturbed 
primary forests relative to undisturbed 
ones). This enabled the authors to quantify 
the direct negative effects of deforestation and 
those resulting from the plethora of other types 
of forest disturbance. 

The authors assembled an impressive 
data set collected across a large region of the 
Brazilian Amazon. They sampled 36 catchments 
(each 32-61 square kilometres in size) containing 
small rivers, spanning Belém and Tapajós, 
two major regions of endemism — areas that 
contain species that are found nowhere else. 
The sample catchments varied in the degree of 
disturbance: 5 were entirely deforested, whereas 
the other 31 contained varying amounts of 
remnant forest, including undisturbed primary 
forests, and primary forests that had been disturbed 
by hunting, selective logging or fires, or 
isolated by surrounding farmland. Sampling 
the biodiversity across each catchment, Barlow 
and colleagues encountered a breathtaking 
total of 1,538 plant species, 460 bird species 
and 156 dung beetle species. 

Their findings make for uncomfortable 
reading. Even catchments that retained 80% of 
their forest cover — the maximum that can be 
required of Amazonian estates under Brazil’s 
Forest Code legislation — lost between 39% 
and 54% of their conservation value, and about 
half of this loss is due to disturbance within the 
remaining forest areas, rather than the losses 
from conversion to farmland. By extrapolat- 
ing these disturbance-driven losses across the 
state of Para, which represents 25% of the entire 
Brazilian Amazon, the authors found that con- 
servation losses from disturbance are equivalent 
to the losses that would result from deforesting 
92,000-139,000 km² of primary forest — an area 
roughly equivalent to the size of Greece. 

They also found that species with higher 
conservation importance were more negatively 
affected by forest disturbance. Bird species that 
were restricted to small regions (with small 
global range sizes) fared worse than those with 
larger distributions, suggesting that forest dis- 
turbance is homogenizing biodiversity across 
regions⁹. Tree species with high wood density 
declined more than those with softer wood, 
degrading the ability of disturbed forests to 
store carbon, thereby driving climate change”’. 


Figure 1 | Forest disturbance drives major conservation losses. Barlow et al.⁴ report that the combined effects of various human-driven disturbances in the 
forests of the Brazilian Amazon can cause biodiversity losses on a scale similar to, or greater than, those caused by deforestation alone. Conversion to farmland 
can result in biodiversity loss and make forests more vulnerable to edge effects, such as the hot and windy conditions that can drive forest fires, which often ignite 
from farmland fires. Within the remaining rainforest, biodiversity can be affected by bushmeat hunting or selective logging. 


Barlow and colleagues’ study has three 
limitations worthy of comment, each of 
which probably means that their assessment 
of biodiversity loss from disturbance is con- 
servative. First, the extensive human activity 
in the study regions means that many of their 
undisturbed forest plots could have suffered 
from low-level or historical disturbances 
that were not detected. Although the authors 
attempted to correct for this potential bias in 
their analytical approach, if they had sampled 
truly remote and undisturbed forests instead, 
densities of the most sensitive species would 
probably have been higher, and the conser- 
vation value deficit would thus have been 
even more severe in disturbed forests. 

Second, the authors’ results from two areas 
with species endemism were extrapolated 
to a quarter of the Brazilian Amazon, which 
spans an additional three unstudied areas of 
endemism. Their analysis therefore fails to 
fully capture the species differences between 
regions’. Small-ranged species will probably be 
more negatively affected than those with larger 
distributions. 

Finally, mammals were not studied, yet 
they have crucial roles in maintaining healthy 
ecosystems’’. Mammals would be expected 
to suffer at least as profoundly as the sampled 
taxa, owing to the severity of hunting in acces- 
sible forests located among farmlands or near 
roads"’. More generally, whether findings from 
the Brazilian Amazon can be extrapolated to 
other tropical regions is unknown. Barlow 
et al. have given scientists the impetus and 
methodological tools to make such assessments, 
making similar studies a research frontier elsewhere 
in the tropics. 

This research challenges the governance 
of tropical forests, and hence the protection 
of the myriad conservation and ecosystem 
benefits they provide, which sustain some of 
the most biodiverse ecosystems on the planet 
and the livelihoods of millions of people. As 
the authors acknowledge, avoiding deforestation 
must remain a key tenet of conservation 
strategies’. However, their results underscore 
the need for a step-change in forest governance, 
with much greater emphasis on 
the ecological health of retained forest’. 
In particular, Barlow and colleagues have 
shown that policies, such as Brazil’s Forest 
Code, that set targets for forest cover without 
also setting requirements for forest quality are 
insufficient to prevent substantial conservation 
losses, and are a slippery slope to greatly 
impoverished ecosystems”. 

To remedy this, agricultural landscapes 
must be better designed to promote the protection 
of larger and less-isolated forest blocks. 
More-stringent regulation and enforcement 
are needed, both of fire use in agriculture 
(which frequently spills into forests) and of 
selective logging — together with clear economic 
benefits for more sustainably managed 
agriculture and sustainable logging’. These 
all require coordination between landowners, 
policymakers and conservationists across 
entire landscapes and regions. 

In many tropical forest regions, disturbed 
primary forests that have seen big biodiversity 
losses are especially valuable for conservation, 
because large undisturbed forests are rare or 
completely lacking. Within such regions, these 
results underscore the necessity of assisting 
the recovery of disturbed forests or of taking 
unproductive farmland out of use to restore 
forest coverage and connectivity. Although 
the biodiversity extinction crisis could be even 
worse than currently recognized, by embracing 
better management strategies, the solutions are 
still within our grasp. 


David P. Edwards is in the Department of 
Animal and Plant Sciences, University of 
Sheffield, Sheffield S10 2TN, UK. 
e-mail: david.edwards@sheffield.ac.uk 


1. Lewis, S. L., Edwards, D. P. & Galbraith, D. Science 349, 827-832 (2015). 
2. Newbold, T. et al. Nature 520, 45-50 (2015). 
3. Nepstad, D. et al. Science 326, 1350-1351 (2009). 
4. Barlow, J. et al. Nature 535, 144-147 (2016). 
5. Peres, C. A. Conserv. Biol. 14, 240-253 (2000). 
6. Edwards, D. P., Tobias, J. A., Sheil, D., Meijaard, E. & Laurance, W. F. Trends Ecol. Evol. 29, 511-520 (2014). 
7. Barlow, J. & Peres, C. A. Ecol. Appl. 14, 1358-1373 (2004). 
8. Ferraz, G. et al. Science 315, 238-241 (2007). 
9. Socolar, J. B., Gilroy, J. J., Kunin, W. E. & Edwards, D. P. Trends Ecol. Evol. 31, 67-80 (2016). 
10. Peres, C. A., Emilio, T., Schietti, J., Desmoulière, S. J. M. & Levi, T. Proc. Natl Acad. Sci. USA 113, 892-897 (2016). 
11. Laurance, W. F., Goosem, M. & Laurance, S. G. W. Trends Ecol. Evol. 24, 659-669 (2009). 


This article was published online on 29 June 2016. 


COMPUTATIONAL NEUROSCIENCE 


Species-specific 
motion detectors 


A range of neuronal mechanisms can enable animals to detect the direction of 
visual motion. Computational models now indicate that a factor as simple as eye 
size might explain some of this diversity. SEE ARTICLE P.105 


THOMAS EULER & TOM BADEN 


Seeing whether and where an object moves 
is crucial for the survival of any visually 
oriented animal, whether predator or 
prey. Consequently, motion and its direction 
are computed at many levels along the vertebrate 
visual pathway, starting in the retina. 
One key element of direction-selective retinal 
neuronal circuits is the starburst amacrine 
cell (SAC). On page 105, Ding et al.¹ unpick 
the mechanisms that mediate SAC direction 
selectivity in the mouse retina. 

Structures called dendrites project radially 
from the body of the SAC to give the cell its 
characteristic shape, which resembles an 
exploding star. The dendrites receive excitatory 
inputs from the retina’s light-sensing 
photoreceptor cells via bipolar cells, and in 
turn make inhibitory synaptic connections 
to neurons called direction-selective ganglion 
cells (DSGCs) and other SACs. Different types 
of DSGC are each robustly tuned to movement 
in a particular (preferred) direction. 

The work of numerous labs over past 
decades has indicated that the inhibitory 
signals sent by SACs to DSGCs contain 
information about movement direction. 
However, an SAC as a whole is not selective for 
one particular direction; instead, each dendrite 
is tuned to the direction of movement that is 
aligned with the direction from the cell body to 
the dendritic tip’. In addition, SAC dendrites 
tuned to a particular direction make more syn- 
aptic connections with DSGCs that prefer the 
opposite direction than with those of the same 
preference, providing DSGCs with informa- 
tion that defines their tuning’. 

Although this general layout is broadly 
accepted, the mechanism that renders SAC 
synaptic outputs direction-selective is still 
intensely debated. In mammals, several (not 
necessarily mutually exclusive) mechanisms 
have been proposed. Some rely on proper- 
ties of the dendrites — for instance, the spa- 
tial arrangement of channel proteins in the 
membrane or of a chloride gradient along the 
dendrite. Others invoke network interactions 
such as reciprocal inhibition between SACs or 
a particular spatial arrangement of bipolar-cell 
inputs that signal to the neuron with different 
timings (reviewed in refs 4 and 5). But the rela- 
tive contribution of each mechanism is unclear. 




[Figure 1 schematic. Labels: Rabbit; Mouse; SAC cell body.]

Figure 1 | Direction is in the eye of the beholder. a, Rabbits have larger eyes than mice. Therefore,
the image of an object travelling a given distance (grey arrows) will traverse different distances across
the retina of each species (green arrows). Ding et al.¹ hypothesize that neuronal circuits in mice thus
need to respond to lower retinal-image velocities to compute information about movement direction.
b, The authors find that the synaptic connections to direction-selective starburst amacrine cells (SACs)
in the retina differ between the two species. As shown in this simplified schematic, inputs (both from
other SACs and from bipolar cells) cover the length of the cells’ radially projecting dendrites in rabbits,
whereas inputs and outputs are well segregated along the dendrites in mice. This difference increases the
directional tuning of mouse SAC dendrites at slower retinal velocities.


Ding et al. investigated the role of network 
interactions in SAC direction selectivity in 
mice. They used a large electron-microscopy 
data set to generate a map of all the synaptic 
connections to and from SACs. This allowed 
them to precisely assess the spatial arrange- 
ment of both input and output synapses along 
SAC dendrites. Synapses followed a clear pat- 
tern: inputs (both excitatory and inhibitory) 
were restricted to the proximal section of 
the dendrites, whereas output synapses were 
located at the dendritic tips (Fig. 1). 

Although this output arrangement comes as 
no surprise, the proximal restriction of inhibi- 
tory inputs is in stark contrast to the situation 
seen in rabbit SACs, which show much less 
spatial segregation⁶. This is puzzling. Why
should two ground-dwelling mammals that 
live in similar environments use different solu- 
tions to compute motion direction? 

To address this question, Ding and colleagues 
used a computer model to simulate how 
different synaptic arrangements might affect 
SAC direction selectivity. They connected a 
single SAC to bipolar cells and neighbouring 
SACs. They then varied the arrangement of 
input synapses along the central SAC’s dendrites
to mimic the mouse and rabbit ‘solutions’,
and compared the cells’ performance.
The higher degree of segregation in mice 
generated more-robust directional tuning, in 
particular for stimuli that traversed the retinal 
surface slowly. The authors then went on to 
confirm these predictions experimentally by 
recording motion-evoked signals in mouse 
SAC dendrites. 

Why would mice need to compute slower 
movements than rabbits? After all, the veloci- 
ties of movement that these animals encounter 
in the wild are probably similar. The answer
is deceptively simple. Mice have smaller eyes
than rabbits. A moving object's angular velocity 
(velocity measured as the angle travelled per 
unit of time instead of distance per unit of time) 
translates to a lower absolute velocity across the 
retinal surface of a smaller eye than that for a 
larger eye. Therefore, the object’s image moves 
substantially more slowly on the mouse’s retinal
surface (Fig. 1). Perhaps the less-segregated
synaptic arrangement in the rabbit circuit is 
good enough to encode the relevant range of 
stimulus velocities, whereas mice needed to 
evolve a solution that also reliably works for 
slower movements across the retina. 

This is a neat study for several reasons. First, 
it represents a well-balanced synthesis of large- 
scale, high-resolution circuit anatomy, realistic 
modelling and synaptic neurophysiology. 
Second, demonstrating that a simple differ- 
ence such as eye size can have a direct impact 
on how circuits and computations are imple- 
mented highlights the often-underestimated 
importance of taking species differences into 
account. The suggestion that different species 
may use different adaptations of a general com- 
putation for retinal direction selectivity could 
be the key to reconciling seemingly contradic- 
tory findings in the field. 

Third, this study takes a crucial step towards 
the development of a truly integrated model 
of direction-selective retinal circuitry. A care- 
fully extended version of the model designed 
by Ding and co-workers (for instance, one 
that includes more-realistic bipolar inputs) 
could be instrumental in disentangling dif- 
ferent direction-selectivity mechanisms. In 
addition, this approach will allow research- 
ers to systematically address other mysterious 
aspects of retinal direction detection, such as 
the role of the molecule acetylcholine, which is
SACs’ secondary neurotransmitter and seems
to play a part in signalling only under certain
stimulus conditions⁷,⁸.

Retinal direction-selective circuits should 
now be studied in other species, perhaps start- 
ing with those that have extreme eye sizes or 
more-distant evolutionary roots. For instance, 
the DSGCs of zebrafish larvae have largely similar
properties to those of mammalian DSGCs⁹,
suggesting a similar organization of direction- 
selective circuits. However, in the tiny larval 
eye, an object moving at an angular velocity of 
1 degree per second crosses the retinal surface 
at a mere 3 micrometres per second — 10 times 
more slowly than in mice. Maybe zebrafish 
have an even more precise synaptic arrange- 
ment along SAC dendrites. Or, perhaps more 
likely, they have another direction-detection 
mechanism altogether. 
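
The eye-size argument comes down to one multiplication, which the following minimal sketch works through. It uses only the two figures given in the text (3 micrometres of retina per degree in the zebrafish larva, and ten times that in the mouse). The function names are illustrative, and the 1.7 mm nodal distance in the last line is a hypothetical value chosen only to show the small-angle geometry; it is not a measurement from the article.

```python
import math


def microns_per_degree(nodal_distance_mm: float) -> float:
    """Retinal distance (micrometres) covered by 1 degree of visual angle.

    Small-angle geometry: arc length = radius * angle in radians, with the
    radius taken as the distance from the eye's nodal point to the retina.
    """
    return math.radians(1.0) * nodal_distance_mm * 1000.0


def retinal_speed(angular_speed_deg_s: float, um_per_deg: float) -> float:
    """Speed of the image across the retina, in micrometres per second."""
    return angular_speed_deg_s * um_per_deg


# Scale factors taken from the text: 3 um per degree in the zebrafish larva,
# and ten times that in the mouse.
ZEBRAFISH_LARVA_UM_PER_DEG = 3.0
MOUSE_UM_PER_DEG = 10.0 * ZEBRAFISH_LARVA_UM_PER_DEG

for species, scale in [("zebrafish larva", ZEBRAFISH_LARVA_UM_PER_DEG),
                       ("mouse", MOUSE_UM_PER_DEG)]:
    print(species, retinal_speed(1.0, scale), "um/s at 1 deg/s")

# Hypothetical nodal distance, for illustration only (not from the article).
print(round(microns_per_degree(1.7), 1), "um per degree for a 1.7 mm nodal distance")
```

The same angular velocity therefore maps onto very different retinal speeds, which is why a circuit tuned for one eye size need not work unchanged in another.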

Primate and rabbit eyes are not that different 
in size. However, direction selectivity in the 
primate retina is a puzzle. SACs are present, 
but are sparser than in any other mammal 
studied¹⁰. Despite long-standing and vigorous
attempts, there is no direct evidence so
far that primates have DSGCs (discussed in 
ref. 11). Instead, primates seem to generate 
direction-selective responses farther along the 
visual pathway. 

What is the take-home message? Maybe, 
that the goal of neuroscience is not to ‘solve’ 
the mouse, rabbit or zebrafish. Instead, neuro- 
scientists should collect different solutions to 
common computational problems. Which 
solutions are actually implemented in any 
particular instance is perhaps secondary. After 
all, neuroscience is about building an under- 
standing of the general principles by which 
neurons and networks generate function.


Thomas Euler and Tom Baden are at the
Institute for Ophthalmic Research, University
of Tübingen, 72076 Tübingen, Germany.
T.E. is also in the Center for Integrative
Neuroscience, University of Tübingen.

T.B. is also at the School of Life Sciences, 
University of Sussex, Brighton, UK. 

e-mails: thomas@eulerlab.de; 
tom@badenlab.org 


1. Ding, H., Smith, R. G., Poleg-Polsky, A., Diamond, J. S. & Briggman, K. L. Nature 535, 105-110 (2016).
2. Euler, T., Detwiler, P. B. & Denk, W. Nature 418, 845-852 (2002).
3. Briggman, K. L., Helmstaedter, M. & Denk, W. Nature 471, 183-188 (2011).
4. Vaney, D. I., Sivyer, B. & Taylor, W. R. Nature Rev. Neurosci. 13, 194-208 (2012).
5. Kim, J. S. et al. Nature 509, 331-336 (2014).
6. Famiglietti, E. V. J. Comp. Neurol. 309, 40-70 (1991).
7. Grzywacz, N. M. & Amthor, F. R. Vis. Neurosci. 24, 647-661 (2007).
8. Lee, S., Kim, K. & Zhou, Z. J. Neuron 68, 1159-1172 (2010).
9. Gebhardt, C., Baier, H. & Del Bene, F. Front. Neural Circuits 7, 111 (2013).
10. Rodieck, R. W. J. Comp. Neurol. 285, 18-37 (1989).
11. Borst, A. & Euler, T. Neuron 71, 974-994 (2011).




This article was published online on 22 June 2016. 


Nature INSIGHT


INTESTINAL MICROBIOTA IN HEALTH AND DISEASE

Cover illustration 
Jessica Fortner 


Editor, Nature 
Philip Campbell 


Publishing 
Richard Hughes 


Insights Editor 
Ursula Weiss 


Production Editor 
Elizabeth Batty 


Art Editor 
Nik Spencer 


Sponsorship
Yuki Fujiwara
Yvette Smith


Production
Ian Pope


Marketing 
Nicole Jackson 
Alan Abery 
Aiko Shuzui 


Editorial Assistant 
Giacomo Russo 


The Campus 
4 Crinan Street 
London N1 9XW, UK 


Tel: +44 (0) 20 7833 4000 


e: nature@nature.com 


SPRINGER 
NATURE 


7 July 2016 / Vol 535 / Issue No 7610 


The human gut is home to trillions of
microorganisms, which modulate health and
disease. This Insight brings together leaders in
the field of microbiota–host interactions to provide an
overview of basic biological processes and important 
advances in the development of clinical applications. 

Jeff Gordon and colleagues present a microbial 
perspective of human developmental biology. They 
describe how the microbiota affects prenatal and 
postnatal growth and explain how an understanding 
of such communities could help to prevent and treat 
diseases. To this end, they call for the establishment 
of ‘human microbial observatories’ to examine the 
development of the microbiota in birth cohorts with 
diverse lifestyles and patterns of disease. 

Justin Sonnenburg and Fredrik Bäckhed analyse how
the microbiota and diet interact to influence metabolism. 
They review mechanisms used by the microbiota to 
modulate the effects of diet on the host's metabolic status, 
as well as the potential for therapeutic intervention. 

Eran Elinav and colleagues discuss crosstalk between 
the microbiota and the innate immune system, focusing 
on bacterial components and host response pathways, 
mutually beneficial effects of such communication and 
diseases that arise when this interaction is disturbed. 

Kenya Honda and Dan Littman summarize our 
understanding of how specific microbes determine 
aspects of adaptive immunity and the part that they 
play in the induction of both immune tolerance and 
conditions such as allergy and intestinal inflammation. 

Andreas Bäumler and Vanessa Sperandio examine
interactions between the gut microbiota and pathogenic 
bacteria, including how pathogenic species exploit 
microbiota-derived sources of carbon and nitrogen as 
nutrients and regulatory signals for growth and virulence. 

And Rob Knight and colleagues consider the advent of 
microbiome-wide association studies, which have been 
enabled by advances in DNA sequencing, metabolomics, 
proteomics and computation. They provide a road map 
for realizing the promise of microbiome-based precision 
diagnostics and therapies. 

Nature is pleased to acknowledge the financial support of 
Yakult Honsha Co., Ltd in producing this Insight. As always, 
Nature carries sole responsibility for all editorial content. 


Christina Tobin Kåhrström, Nonia Pariente & Ursula Weiss
Senior Editors 




CONTENTS 


PERSPECTIVE 

48 A microbial perspective of human
developmental biology
Mark R. Charbonneau, Laura V. Blanton,
Daniel B. DiGiulio, David A. Relman,
Carlito B. Lebrilla, David A. Mills
& Jeffrey I. Gordon


REVIEWS 

56 Diet-microbiota interactions as 
moderators of human metabolism 
Justin L. Sonnenburg & Fredrik Bäckhed


65 The microbiome and innate immunity 
Christoph A. Thaiss, Niv Zmora, 
Maayan Levy & Eran Elinav 


75 The microbiota in adaptive immune 
homeostasis and disease 
Kenya Honda & Dan R. Littman 


85 Interactions between the microbiota 
and pathogenic bacteria in the gut 
Andreas J. Bäumler & Vanessa Sperandio


94 Microbiome-wide association studies 
link dynamic microbial consortia to 
disease 
Jack A. Gilbert, Robert A. Quinn, Justine 
Debelius, Zhenjiang Z. Xu, James Morton, 
Neha Garg, Janet K. Jansson, Pieter C. 
Dorrestein & Rob Knight 




PERSPECTIVE 


doi:10.1038/nature18845 


A microbial perspective of human 
developmental biology 


Mark R. Charbonneau¹,², Laura V. Blanton¹,², Daniel B. DiGiulio³,⁴, David A. Relman³,⁴,⁵, Carlito B. Lebrilla⁶,⁷,
David A. Mills⁷,⁸,⁹ & Jeffrey I. Gordon¹,²


When most people think of human development, they tend to consider only human cells and organs. Yet there is another 
facet that involves human-associated microbial communities. A microbial perspective of human development provides 
opportunities to refine our definitions of healthy prenatal and postnatal growth and to develop innovative strategies for 
disease prevention and treatment. Given the dramatic changes in lifestyles and disease patterns that are occurring with 
globalization, we issue a call for the establishment of ‘human microbial observatories’ designed to examine microbial 
community development in birth cohorts representing populations with diverse anthropological characteristics,
including those undergoing rapid change.


A survey of the biological landscape that encompasses human
development should consider all facets of what it means to
be ‘human’. There are at least as many microbial cells as there
are human cells in our bodies, and the vast majority of unique genes
are microbial¹,². As such, we can view ourselves as holobionts³. The
dynamic microbe-microbe and microbe-host interactions that allow 
our microbial communities to assemble and endure are as yet largely 
uncharacterized. Our relationships with microbes begin before birth; 
they represent potentially modifiable features of postnatal development, 
and probably contribute to intra- and interpersonal variations in many 
aspects of normal physiology, metabolism, immunity and neurology, 
as well as to predisposition to diseases. 

The past decade has produced a magnificent and still rapidly evolv- 
ing toolbox of experimental and computational techniques for culture- 
independent identification of the microorganisms that comprise our 
body habitat-associated microbial communities (microbiota), as well 
as their genes (microbiome) and gene products. These tools allow a 
number of hypotheses about microbial contributions to human devel- 
opment to be tested. One hypothesis is that maternal microbial ecology 
affects pregnancy, fetal development and the future health of offspring. 
If true, the hypothesis suggests the possibility of prenatal prognostic 
and diagnostic measurements and therapeutic interventions that target 
the maternal microbiota to guide healthy fetal development and avoid 
premature birth and other negative outcomes. Another hypothesis is 
that after birth, there are microbial taxa whose changing patterns of 
representation can be used to define ‘normal’ programmes of develop- 
ment of the microbial communities that occupy a given body habitat in 
biologically unrelated individuals with healthy growth phenotypes (as 
defined by anthropometric indices). A corollary to this hypothesis is 
that deviations from these normal programmes of community assem- 
bly represent a way to characterize abnormal development, including 
states of immaturity or precocious maturation. Establishing a causal 
relationship between the state of microbial community development 
and healthy growth would allow deviations from normal microbiota 
development to be used as a parameter for risk assessment or classification
of a number of diseases that may manifest themselves early or later
in life, yield insights about disease pathogenesis, and provide a starting
point for developing microbiota-directed therapeutic interventions or 
new approaches for disease prevention. 

In this Perspective, we discuss evolving concepts about the relation- 
ship between maternal microbial ecology (before, during and after preg- 
nancy) and pregnancy outcomes as well as the relationship between 
human breast milk oligosaccharides, the establishment and expressed 
functions of the gut microbiota and healthy postnatal growth. We also 
address the need for long-term birth cohort studies to identify both 
shared and distinctive features of microbial community development, 
within and across populations, and delineate how normal execution 
(and perturbations) of this facet of human developmental biology is 
related to health status. 


Maternal microbial ecology 

The structure and function of maternal microbial communities, and the
impact of these communities on maternal and infant health outcomes,
have been considered in several body habitats, including the vagina, the
distal gut and the mouth.


Vaginal microbiota 

For decades, culture-based studies have suggested that lactobacilli are 
the most prevalent constituents of the vaginal microbiota in non-pregnant
and pregnant women⁴. More recently, culture-independent studies
have demonstrated that most vaginal communities are dominated
numerically by a single Lactobacillus species. This finding has prompted 
some investigators to assign vaginal communities to a relatively limited
number of discrete ‘community state types’ (CSTs). These CSTs are
classified either by which Lactobacillus species is dominant (CST I, II,
III and V) or by the presence of a relatively diverse, Lactobacillus-poor
community (CST IV)⁵. The resolution and veracity of the vaginal CST
model remain unsettled: some investigators have proposed other stable
or transitional states beyond the five described initially⁶. Others have
highlighted potential pitfalls, including the extent to which the detection
of state types is dependent on the analytical workflow⁷. Irrespective
of the ultimate usefulness of the CST model, the limited diversity


¹Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St Louis, Missouri 63110, USA. ²Center for Gut Microbiome and Nutrition Research, Washington
University School of Medicine, St Louis, Missouri 63110, USA. ³Department of Medicine, Stanford University, Stanford, California 94305, USA. ⁴VA Palo Alto Health Care System, Palo Alto, California
94304, USA. ⁵Department of Microbiology and Immunology, Stanford University, Stanford, California 94305, USA. ⁶Department of Chemistry, University of California, Davis, Davis, California 95616,
USA. ⁷Foods for Health Institute, University of California, Davis, Davis, California 95616, USA. ⁸Department of Food Science and Technology, University of California, Davis, Davis, California 95616,
USA. ⁹Department of Viticulture and Enology, University of California, Davis, Davis, California 95616, USA.


of abundant taxa in vaginal communities suggests that a deterministic 
process of community assembly, such as habitat filtering, governs the 
overall structure of the adult vaginal microbiota. 

CST IV is similar to the microbiota structure encountered in bacterial 
vaginosis, a dysbiosis that is associated with adverse health outcomes, 
including preterm birth””’. In non-pregnant North American women, 
the prevalence of CSTs varies with self-reported race and ethnicity. 
CST IV is observed in about 40% of African American and Hispanic 
women, but only about 20% of Asian American women and about 10% 
of Caucasian women’. This skewed distribution suggests that a diverse, 
non-Lactobacillus-dominated community (CST IV) might represent a 
normal variant in a subset of women and argues for an expanded assess- 
ment of what comprises a healthy vaginal microbiota. 

Little is known about the development of the vaginal microbiota 
before and after puberty, or how different vaginal community ‘fates’ 
(structural and functional states) in adulthood are determined. One 
area that should be investigated is the relationship between the gly- 
can content of the vaginal mucosa and the community state, including 
the biogeographical features of each state. In addition, much remains 
to be learned about the effects of bacterial and eukaryotic taxa (and 
the viruses they host) on vaginal epithelial-cell differentiation, vaginal 
mucosal metabolism and the activities of components of the innate and 
adaptive arms of the immune system that are represented in this habi- 
tat. The development of microarrays composed of purified microbial 
glycans” provides one way to characterize immunological responses to 
the bacterial antigens represented in the vaginal microbiota and thus 
creates another approach to classify community states. Representative 
preclinical models are needed for testing whether causal relationships 
exist between these and other environmental factors and community 
states. They will also help to characterize the mechanisms that shape 
community assembly, that determine community responses to various 
perturbations, that underlie community resiliency and that mediate the 
effects of community states on host biology. 

A compelling question is whether there is a discernable programme 
of change in the properties of the vaginal microbiota before, during 
and after pregnancy and, if there is, the extent to which such change 
recapitulates features of the original developmental biology of the com- 
munity. A related question is whether and how functional alterations in 
vaginal microbial community states and in the microbiota at other body 
sites during pregnancy affect intrauterine growth of the fetus (Box 1). 
Work has focused on bacterial taxonomic composition of these com- 
munities rather than on the functional features they express. Studies 
currently suggest that the bacterial composition of the microbiota is 
more stable during pregnancy than at other times during adulthood’*”*. 
The diverse CST IV seems to be the least stable community state dur- 
ing pregnancy: it exhibits a substantially higher rate of transition to 
alternative CSTs on a week-to-week timescale than do the four Lacto- 
bacillus-dominated CSTs™. A note of caution is that vaginal microbial 
community composition has not been defined in time-series studies 
in which samples are taken from the same women before conception, 
during and after pregnancy, and during subsequent pregnancies. In 
addition, little is known about the non-bacterial membership of the 
pregnancy-associated microbiota. 

Some factors are thought to promote vaginal microbiota structural 
stability during pregnancy, such as a lack of menses. However, many 
factors remain unknown, as is the degree to which structural stability 
is accompanied by functional stability and how this relates to the trans- 
fer of taxa from mothers to infants during the immediate postpartum 
period. From an anthropological perspective, it is interesting to note 
that prescribed diets during pregnancy are an important part of the cul- 
tural traditions of some populations’*. It is unclear how these treatments 
affect vaginal (and gut) microbial community structure, function and 
stability. The answers to these questions could yield fresh approaches 
for deliberate manipulation of the vaginal microbiota. 

In contrast to the structural stability seen during pregnancy, studies of 
women in the United States, Europe, Africa and Asia have shown that, 




BOX 1
The evolving maternal-fetal microbial landscape

Much remains to be learned about the assembly, host interactions
and transmission of maternal and early childhood microbial
communities. Hypotheses about the evolving maternal-fetal
microbial landscape deserve further attention and testing.

• Some activities of the maternal microbiota might have a beneficial
impact on fetal nutrition and development.

• Altered compositions and expressed activities of the maternal
microbiota might contribute to gestational outcomes, including
adverse events such as premature labour and birth.

• Microbes that are transferred to offspring before or during delivery
might reflect environmental exposures of the mother during
pregnancy (for example, diet).

• Persisting disturbances in the vaginal microbiota after giving birth
might pose a risk for preterm delivery in subsequent pregnancies.

• Variations in the transfer of microbes from mothers to infants
might affect early postnatal development of the child’s microbiota,
immune system and metabolic processes.


after delivery, the vaginal microbiota commonly undergoes an abrupt 
and striking alteration in its taxonomic composition'*"””. This altera- 
tion is characterized by a significant increase in within-community 
(or α) diversity and is driven by a decrease in the abundance of Lactobacillus
species and a commensurate increase in a wide range of anaerobic
species. Although many features of altered postpartum microbial com- 
munities remain to be elucidated (such as the time it takes to return to 
the ‘baseline’ state), it seems that they can persist for at least 1 year in 
many women”. A short interval (less than 12 months) between preg- 
nancies is associated with an increased risk of preterm birth; whether 
a persisting altered postpartum vaginal community contributes to this 
risk warrants further study. 
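
To make the notion of within-community (α) diversity concrete, the sketch below computes the Shannon index, one commonly used α-diversity measure, for two made-up communities. The taxon names and read counts are hypothetical and serve only to illustrate the direction of the postpartum shift described above; the studies cited may have used other indices.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts, for illustration only.
lactobacillus_dominated = {"Lactobacillus": 95, "anaerobe A": 3, "anaerobe B": 2}
postpartum_like = {"Lactobacillus": 30, "anaerobe A": 25,
                   "anaerobe B": 25, "anaerobe C": 20}

h_dominated = shannon_diversity(lactobacillus_dominated.values())
h_postpartum = shannon_diversity(postpartum_like.values())

print("Lactobacillus-dominated:", round(h_dominated, 2))
print("Postpartum-like:", round(h_postpartum, 2))
```

A community dominated by one taxon scores close to zero, whereas a more even mix of taxa scores higher, which matches the pattern of falling Lactobacillus abundance and rising α diversity.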


Gut microbiota 

Much more information is needed about whether the structural and 
functional properties of the gut microbiota of women change as a func- 
tion of pregnancy. If changes do occur, whether and how they relate to 
maternal and fetal health, as well as the subsequent health of infants and 
children, should be investigated. The relationship between maternal 
nutritional status at the time of conception and the health of the new- 
born is well established”. A study of pregnant Finnish women reported 
a significant increase in faecal energy content, as determined by bomb 
calorimetry, between the first and third trimesters despite stable diets 
and energy intake”’. This change in energy content correlated with shifts 
in taxonomic composition”. However, studies of women residing in 
the United States’ and in Tanzania’’, which were conducted at higher 
temporal resolution, found that the women’s faecal microbiota mani- 
fested compositional stability throughout pregnancy (as measured by 
trends of α diversity, week-to-week variation in bacterial composition
within subjects and β diversity across gestational time). The reasons
for these divergent findings are unclear. The maternal microbiota and 
diet also have the potential to influence both fetal and maternal epig- 
enomes, although a discussion of this topic is beyond the scope of this 
Perspective. 
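
Week-to-week compositional stability of the kind described above can be quantified by comparing consecutive samples from the same subject. The sketch below uses Bray-Curtis dissimilarity, one common between-sample (β-diversity) measure; this is an assumption for illustration, as the studies cited may have used other metrics, and the weekly counts are hypothetical.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two taxon -> count mappings.

    0 means identical composition; 1 means no shared taxa.
    """
    taxa = set(sample_a) | set(sample_b)
    difference = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total


# Hypothetical weekly faecal samples from one subject (illustration only).
weekly_samples = [
    {"Bacteroides": 50, "Faecalibacterium": 30, "Bifidobacterium": 20},
    {"Bacteroides": 48, "Faecalibacterium": 32, "Bifidobacterium": 20},
    {"Bacteroides": 52, "Faecalibacterium": 29, "Bifidobacterium": 19},
]

# Dissimilarity between each pair of consecutive weeks.
week_to_week = [bray_curtis(a, b)
                for a, b in zip(weekly_samples, weekly_samples[1:])]
print([round(d, 3) for d in week_to_week])
```

Consistently small values across gestation would indicate the kind of compositional stability reported, whereas a drift towards larger values would indicate change.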


Oral microbiota 

Mothers harbour complex microbial communities in their mouths. The 
composition” and transcriptional activities* of these communities are 
altered in the setting of periodontitis, a condition also associated with
intrauterine growth restriction, preterm birth and low birthweight™. A
study of the oral microbiota of women living in the United States and 
Africa indicated that the taxonomic composition remains stable during 
pregnancy'*””. However, pre-conception data from the same women 
were unavailable for comparison. Microbial taxa have been detected 
in amniotic fluid” and in the placenta” that probably originate from 
the mouth, particularly in women who are unhealthy or had adverse 
outcomes such as preterm labour with intact fetal (chorioamniotic) 
membranes or premature rupture of these membranes. Disentangling 
adverse effects on pregnancy that originate from the oral microbiota is 
challenging, especially if disease results from a perturbation in relatively 
minor constituents of the community”. 

Development of the oral microbiota has not been comprehensively 
defined through time-series studies of healthy infants and children. For 
example, the effects of maternal prenatal history, gestational age, route of 
delivery and milk-feeding history remain to be characterized. One study 
attempted to define ‘normal’ development by following 50 children from 
the ages of 4 years to 6 years”’. A strong effect of chronological age was 
observed on the taxonomic composition of the oral microbiota. This 
effect was more pronounced for bacterial communities in supragingival 
plaque than in saliva, which suggests body-habitat-specific differences 
in community assembly programmes. Deviations from early, normal 
community compositions were predictive of subsequent development 
of dental caries”. 


Preterm delivery and fetal exposure to microbes 

The extent to which the fetal environment is sterile has been pon- 
dered since the birth of the field of microbiology”. Early studies sug- 
gested that the amniotic cavity was universally sterile before labour”, 
although subsequent, indirect evidence has challenged that assump- 
tion™. Culture-based and later, polymerase-chain-reaction-based 
studies indicated that microbial invasion of the amniotic cavity occurs 
more frequently and involves a greater diversity of microbes than was 
originally thought**”*. Endometrial sampling of the intrauterine cav- 
ity in non-pregnant women has yielded widely varying rates (0-89%) 
of microbial recovery across culture-based studies*’. Molecular-based 
studies suggest that most uteruses harbour microbes, with Lactobacillus, 
Prevotella and Bacteroides among the genera that are most commonly 
encountered****. However, data obtained during pregnancy are lacking. 

At the time of delivery, the basal plate of the placenta contains intra- 
cellular bacteria in about a quarter of women, but in about half of those 
who deliver spontaneously before 28 weeks of pregnancy”. One study 
has shown that the placenta harbours a complex set of microbial DNA 
sequences”. But unlike more densely colonized body sites such as the 
gut and mouth, placental samples are overwhelmingly negative in cul- 
ture-based assays*’. DNA-based assessments of potential microbes in 
the placenta, and other low microbial biomass sites, are particularly 
prone to confounding findings from ‘background’ DNA*™”' and should 
be interpreted with caution in the absence of appropriate controls. The 
degree to which the fetal-placental environment has evolved to serve as 
a venue for programmed engagement of diverse microbes, as opposed 
to being a site that simply tolerates stochastic low-level microbial expo- 
sures, remains unclear and merits further study. 

A report published this year suggests that in women who experi- 
enced spontaneous preterm birth, those with histological evidence 
of severe chorioamnionitis have fewer species of bacteria on the fetal 
side of the placental membrane than do those who do not have severe 
chorioamnionitis”. This difference might be driven by a high abundance
of a limited number of clonal pathogens (as is typical of many
clinical infections) in women with severe chorioamnionitis. Further 
studies with appropriate negative controls are needed to corroborate 
these findings and to resolve unanswered questions such as the body site 
of origin of the detected microbes, as well as the direction and timing of 
their translocation across adjacent tissues”. 

Microbes have been detected in first-pass meconium samples from 
approximately two-thirds of healthy, vaginally delivered, breastfed
full-term babies, but at very low levels**. Detection is more common in
meconium from neonates who are born before 33 weeks of gestation, 
and there is considerable taxonomic overlap with the microbes found in 
the amniotic fluid”. Molecular evidence for microbial invasion of the 
amniotic cavity has provided associations of space, time and ‘dose’ that 
support a causal relationship with preterm birth”. Microbial taxa that 
are associated with preterm birth most frequently originate from the 
mother and exploit one of three natural routes for invading the amniotic 
cavity**: ascension from the vagina and cervix; transfer through the fal- 
lopian tubes; or translocation from more distant sites of colonization 
in the body, presumably through the bloodstream”. The majority of 
invading microbes seem to come from the vagina””*“*, although other 
body habitats, most notably the mouth” and gut”, may have a role in
some cases. Taxa associated with CST IV communities, such as Urea- 
plasma and Prevotella species, are among the more common invaders. 
By contrast, Lactobacillus species are rarely encountered in amniotic 
fluid, even after membrane rupture”. This suggests that features of 
specific microbial taxa, or groups of taxa that occur together in CST IV 
communities, underpin factors that promote invasion of the amniotic 
cavity, such as virulence genes and divergent host immune responses”. 
Whether particular vaginal CSTs or the presence and abundance of indi- 
vidual taxa are associated with preterm birth is an unresolved question 
of great interest. Studies have produced conflicting results’*™*. If vaginal 
CST IV communities are indeed associated with preterm birth in some 
women, this would be broadly consistent with epidemiological evidence 
that links bacterial vaginosis, which shares taxonomic similarity with 
CST IV communities, to an increased risk of preterm birth’. 

The effect of preterm delivery on the development of microbial 
communities in premature babies has been examined mainly from 
the perspective of the infant. Comparing the development of micro- 
bial communities in premature and full-term infants could lead to 
amended or new definitions of biological immaturity. Such definitions 
are confounded, however, by the frequent pre-emptive administration 
of antibiotics to babies who are born prematurely. Maternal microbial 
communities may also exert a significant influence. An elegant study 
demonstrated that transient microbial colonization of pregnant, germ- 
free mice was sufficient to modulate the function of innate immune cells 
in the small intestines of their germ-free offspring”. Microbial prod- 
ucts were detected in both the dam's milk and placenta, which suggests 
that ‘indirect’ exposure to microbes through the mother is sufficient 
to shape neonatal development. Such findings suggest that systematic 
characterization of multiple body-habitat-associated microbial commu- 
nities in mothers who have preterm versus full-term pregnancies creates 
opportunities to examine whether there are identifiable programmes of 
change in maternal microbial ecology during pregnancy and whether 
disruption of these programmes affects initial transfer of microbes to 
their offspring (and the subsequent development of the children’s micro- 
biota). This knowledge could change clinical practice so that more atten- 
tion is placed on careful stewardship of microbial resources in women 
who have a high risk of preterm delivery”. Deliberate efforts could be 
made to transfer these microbes to their offspring, with the potential for 
supplementation with important taxa that are missing. 


Breast milk and infant gut microbiota 

Researchers are beginning to uncover how breast milk composition 
changes over time after parturition and how it shapes the structural 
and functional maturation of infant-associated microbial communities. 


Microbes associated with breast milk 

Studies of milk-associated microbiota reveal highly individualized 
assemblages”’. These groups of microbes are routinely dominated by 
skin-associated bacteria, such as staphylococci and streptococci, which 
generally do not persist in the infant gut in significant numbers for more 
than a few weeks”. Some anaerobic species, such as Bifidobacterium, 
have been isolated from breast milk, which suggests a route for transit 
of specific strains that eventually colonize the infant colon. The factors
that contribute to the strain-specific composition in breast milk are
unclear and are subject to debate.

Figure 1 | Oligosaccharides in human breast milk and strategies for
their degradation by the infant microbiota. a, HMOs that are most
abundant in the breast milk of mothers who are secretors are indicated by
the blue arrow; those that are most abundant in the breast milk of
non-secretors are indicated by the red arrow. Structures at the intersection
of the arrows are found in both secretor and non-secretor mothers in similar
abundances. Monosaccharides in HMOs, as well as their glycosidic linkages,
are described by the inset key. b, Most strains of Bifidobacterium (left)
use an ‘internalize, then degrade’ strategy in which HMO structures are
first imported using ABC transporters and then degraded by intracellular
glycoside hydrolases. Strains of Bacteroides (right) typically employ an
‘external degradation’ strategy for HMO structures, which involves
cell-surface-associated carbohydrate-binding proteins and secreted
glycoside hydrolases that are encoded by polysaccharide utilization loci
(PULs). These PULs have features similar to the prototypic starch utilization
system (Sus) of Bacteroides thetaiotaomicron. This external degradation
can result in ‘cross-feeding’ of secondary consumers, including potentially
pathogenic bacteria, in the infant gut microbiota.
[Figure 1 labels. Panel a: Secretor; Non-secretor; Linkage position.
Monosaccharide symbols: Glucose (Glc), Galactose (Gal),
N-Acetylglucosamine (GlcNAc), Fucose (Fuc), N-Acetylneuraminic acid
(Neu5Ac). Panel b: Internalize, then degrade; External degradation;
Cross-feeding of bacteria; Monosaccharides; Sus homologue; ABC
transporter; Glycoside hydrolase; Bacteroides; Bifidobacterium.]


Human milk oligosaccharides

From a molecular perspective, breast milk is the best-characterized
food that humans consume. The most abundant component in dried
samples of breast milk is lactose, which provides nutrition for the infant,
although many bacterial taxa can also digest this disaccharide. Lactose
is made available specifically to bacterial colonizers of the infant gut by
extending it by 3-20 monosaccharide units to yield structures that are
known collectively as human milk oligosaccharides (HMOs)**™. All
HMOs contain this lactose core together with various combinations
of glucose, galactose, N-acetylglucosamine, fucose and sialic acid
(N-acetylneuraminic acid or Neu5Ac)**. HMOs are often terminated
by fucose or sialic acid (Fig. 1a). Approximately 60% of HMO structures
are fucosylated, and 5-20% are sialylated”*”.

The role of HMOs has become more apparent through the application
of nanoflow liquid chromatography mass spectrometry. This
method has detected more than 300 HMO structures in breast milk
samples pooled from several mothers, with the concentrations of these
structures spanning four orders of magnitude. The number of HMO
structures found in the milk of a particular mother is often more than
100, although the profile of the structures varies between mothers**.
HMOs contain varying amounts of Lewis-blood-group antigens Leᵃ,
Leᵇ, Leˣ and Leʸ (ref. 58). Individuals who produce Leᵇ epitopes
(α(1,2)-fucosylated structures) in their secretions, due to the presence of an
active fucosyltransferase 2 (FUT2) gene, are known as secretors™.
Secretors tend to have higher amounts of HMOs than do non-secretors (as
much as 20% more). They also produce higher levels of fucosylated
structures (nearly twofold more). However, non-secretors often have
higher levels of sialylated structures” (Fig. 1a). The percentage of
non-secretors varies geographically: they comprise about 20% of the
population in Europe and up to 40% in West Africa”.

Whether the HMO profiles of breast milk change as a function of
time after delivery, and how differences in HMO composition relate to
the development of the gut microbiota and healthy growth of the infant
are important unanswered questions. Nanoflow liquid chromatography
mass spectrometry has only been applied to HMO profiling during the
past several years and the assay imposes constraints on throughput”.
Limited information is therefore available about how specific HMOs
change with time in healthy mothers, and whether consistent differences
exist in the HMO profiles across groups of women representing
different ages, parities, geographic locations, nutritional states, culinary
traditions and socio-economic statuses. The general trend during
lactation is a decrease in levels of total HMOs as the mother progresses
from the production of colostrum to mature milk, with the largest drop
occurring in the first month postpartum”. However, the total amount
of milk that is delivered as colostrum is quite small compared with the
volume of mature milk (matching the size of the infant’s stomach and
intestine). Therefore, throughout lactation, the amount of each class
of HMO — and even each specific HMO structure — provided to the
infant remains relatively constant.

Giving birth prematurely can significantly affect the profile of a mother’s
HMO structures™. HMO profiles cannot yet be predicted in these
mothers. Many mothers who deliver preterm have fucosylated HMOs 
that are as low as 20-40% of total HMOs, but some have levels of greater 
than 60%. This discrepancy is not corrected over time. 

A study published this year demonstrated that the HMO content 
of Malawian mothers’ milk correlates with infant growth outcomes”. 
Breast milk samples collected at 6 months postpartum were divided 
into two groups: those from mothers whose infants exhibited healthy 
growth at the time of collection (as defined by anthropometry), and 
those from mothers whose offspring exhibited severe stunting. Liquid 
chromatography-time-of-flight mass spectrometry revealed that 
mothers of infants with stunted growth had significantly lower con- 
centrations of total, sialylated and fucosylated HMOs, with the most 
growth-discriminatory sialylated HMO being sialyllacto-N-tetraose b, 
and the most discriminatory fucosylated HMOs being 2’-fucosyllactose 
and lacto-N-fucopentaose I. 

Sialic acids constitute a group of nine-carbon monosaccharides 
that are derived from neuraminic acid, and include Neu5Ac. UDP-N- 
acetylglucosamine-2-epimerase, which is the rate-limiting enzyme in 
the biosynthesis of sialic acid, is produced at low levels in the livers of 
infants”. Breast milk is therefore an important source of these sugars. 
The availability of sialic acids affects many organs, including the brain, 
where Neu5Ac is a component of gangliosides and is covalently linked 
to neural cell-adhesion molecules (NCAMs) that mediate cell-cell inter- 
actions involved in synaptogenesis and memory”. Supplementation 
of the diet with sialylated glycoproteins and sialyllactose increases the 
polysialylation of NCAM and sialylated gangliosides, with some reports
showing improved memory in animal models”. A preclinical model 
has demonstrated that 6’-sialyllactose also increases muscle mass and 
contractility”. 

Several HMO structures have been produced chemically and
enzymatically’'. However, producing the wide array of structures
encountered in human milk is not yet commercially feasible. There is 
an approximately 25% overlap between bovine milk oligosaccharide 
and HMO structures. Sialylated oligosaccharides are present in mature 
human milk at concentrations that are up to 20-fold greater than in 
mature bovine milk””. Therefore, bovine-milk-derived infant for- 
mulas, as well as complementary or therapeutic foods that are used to 
treat children with undernutrition, are deficient in these compounds. 
However, bovine milk oligosaccharides (BMOs) that have structural 
similarity to HMOs are present in the by-products of dairy process- 
ing, providing an opportunity to purify them at a scale sufficient for 
preclinical and clinical studies, and potentially for wider distribution 
should such studies demonstrate sufficient safety and efficacy, and yield 
an understanding of their mechanism of action. 

A study of gnotobiotic animals has provided direct evidence that 
sialylated milk oligosaccharides are causally related to growth”. Young 
germ-free mice and newborn germ-free piglets were colonized with 
members of the gut microbiota of a Malawian infant who exhibited 
stunted growth. Recipient animals were fed a diet representative 
of foods consumed after weaning by Malawians, with or without 
supplementation with sialylated BMOs that had been purified from a 
whey waste stream generated during the manufacture of cheese. The 
study revealed that sialylated BMOs promote lean body-mass gain, 
improve metabolic flexibility” and affect bone growth. These effects 
were not ascribable to differences in food consumption. They were also 
microbiota-dependent: they were not observed in germ-free animals. 
Moreover, growth promotion was not observed when the animals were 
provided an isocaloric Malawian diet supplemented with a mixture of 
fructo-oligosaccharides, a component of some infant formulas. 


The milk-oriented microbiota 

The initial microbiota of nursing infants is an assemblage of microbes 
derived from mother’s faecal, vaginal and skin microbiota”. Within 
weeks, promicrobial and antimicrobial agents in breast milk help to 
guide development of a milk-oriented microbiota. A common enrich- 
ment involves members of the Actinobacteria, mainly Bifidobacterium 
species, that frequently dominate the gut microbiota of breastfed infants, 
in some cases representing 70-90% of the faecal community”. Intrigu- 
ingly, this enrichment is less pronounced in infants from more indus- 
trialized countries””*. Bifidobacterial enrichment is linked to maternal 
genotype; the breast milk of secretors seems to enrich bifidobacteria 
more rapidly”®. 

Several beneficial functions have been attributed to a milk-oriented 
microbiota that is dominated by bifidobacteria. For example, lactate 
and acetate, the primary end products of bifidobacterial fermentation, 
are important sources of energy for colonocytes. They also lower intes- 
tinal pH and contribute to gut barrier function”. Robust colonization 
by a single bifidobacterial subspecies, Bifidobacterium longum subsp. 
infantis, correlates with improved vaccine responses during the first 
year of life”. Intestinal bifidobacteria also produce essential nutrients, 
including folate and riboflavin®. 

Two dominant species of Bifidobacterium, B. longum and B. breve, 
routinely colonize breastfed infants throughout the world, although 
other species, including B. bifidum, B. catenulatum and B. pseudo- 
catenulatum are also commonly observed. In general, bifidobacteria 
are prolific consumers of HMOs; they possess an array of glycoside 
hydrolases (notably fucosidases and sialidases*) that catalyze the cleav- 
age of key glycosidic linkages, permitting metabolism of some or all of 
the sugar monomers that are embedded in HMOs. The mechanisms 
for HMO consumption by these organisms follow two different strate- 
gies”. B. longum subsp. infantis and, to a lesser extent, B. longum subsp. 
longum, B. breve and B. pseudocatenulatum, transport HMOs directly 
into the cell through ATP-binding cassette (ABC) transporters and 
cleave these oligosaccharides with intracellular glycoside hydrolases 
(Fig. 1b, left)**. By contrast, B. bifidum deploys glycoside hydrolases 
to the cell wall for extracellular cleavage of HMOs before importing 


52 | NATURE | VOL 535 | 7 JULY 2016 


selected products of degradation. Similarly, Bacteroides species, another 
important set of HMO consumers (and frequent members of milk-ori- 
ented microbiota), also deploy external glycoside hydrolases to degrade 
these structures before they are internalized (Fig. 1b, right)”. 

The ‘internalize, then degrade’ approach for HMO consumption
adopted by the majority of infant-borne bifidobacteria can be viewed as 
an ingenious strategy for protecting the neonate. These bacteria prevent 
growth of competitor strains by simple sequestration of available sugar 
substrates in the colon, a concept consistent with the inverse corre- 
lation observed between faecal HMO concentrations and the level of 
bifidobacterial colonization“. An important consideration is whether 
there are deleterious consequences of harbouring a milk-oriented 
microbiota that is dominated by bacteria that degrade HMO exter- 
nally. An antibiotic-treated mouse model has been used to show that 
mucins, large glycoproteins that contain structures similar to those of 
HMOs, can be externally degraded by Bacteroides spp. to release fucose 
and sialic acid monomers that cross-feed various pathogenic bacteria®. 
External degradation of HMOs could lead to growth of pathogens or 
pathobionts in the low-diversity neonatal gut microbiota. Three recent 
studies point to this potential risk. In gnotobiotic mice that were colo- 
nized with the microbiota from a Malawian infant with stunted growth, 
external degradation of sialylated BMOs by Bacteroides fragilis released 
the constituent monosaccharides, including sialic acid, that cross-fed 
Escherichia coli populations”. Others have observed Bacteroides cross- 
feeding Enterobacteriaceae in mice that are fed sialyllactose (an oligo- 
saccharide common to mammalian milks) and in nursing piglets***”. 
Enterobacteriaceae are considered by some researchers to be a harbinger 
of dysbiosis™. 

These findings suggest that the potential for bacterial cross-feeding 
on HMOs may be a risk factor for neonates. They also illustrate the 
extreme caution that should be exercised when composing diets for
neonates that harbour low-diversity gut microbiota during early 
stages of community development. In cases in which a single oligo- 
saccharide prebiotic is being considered, such as fucosyllactose or 
sialyllactose in infant formula, it would help to know the composition 
of the infant milk-oriented microbiota to avoid potential cross-feeding 
of enteropathogens. Alternatively, this problem might be alleviated by 
the use of synbiotics (a combination of pre- and probiotics) in which the 
probiotic component is known to readily consume the oligosaccharides 
provided or the monomers derived from them.

Several challenging questions need to be addressed. First, we know 
very little about the functions of various HMO structures or why mam- 
malian evolution has produced such a diverse repertoire. Even more 
diversity could exist given the number of possible glycosidic linkages, 
suggesting that observed HMO structures were selected for by evolu- 
tion. Second, we need to better characterize the interactions and relative 
effect sizes of the antimicrobial and promicrobial components of breast 
milk on development of the milk-oriented microbiota. One approach 
for addressing these questions is to use gnotobiotic animals colonized 
with milk-oriented microbiota from infants representing different ges- 
tational ages, milk-feeding histories and growth phenotypes. Alterna- 
tively, gnotobiotic animals could be colonized with defined collections 
of cultured bacterial strains, recovered from a given donor's microbiota; 
these clonally arrayed collections can be manipulated so that all mem- 
bers, or subsets of members, are added — with or without pathogens 
and pathobionts — to recipient animals (Fig. 2). Gnotobiotic recipi- 
ents colonized with these communities can be fed breast milk or infant 
formula supplemented with defined milk oligosaccharide structures. 
(Many of the antimicrobial elements of breast milk, including antibod- 
ies, lactoferrin and lysozyme, are absent from such formulas.) These 
models represent one way for determining the rules that govern early 
phases of development of the human gut microbiota. 
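
The subset design described above can be sketched in code. What follows is a minimal illustration, assuming that a clonally arrayed culture collection can be represented as a list of strain identifiers. The strain names, function names and subset sizes are hypothetical and are not taken from any study cited here.

```python
import random


def randomized_subsets(collection, subset_size, n_subsets, seed=0):
    """Draw reproducible random subsets of strains from a culture collection."""
    if not 0 < subset_size <= len(collection):
        raise ValueError("subset_size must be between 1 and the collection size")
    rng = random.Random(seed)  # fixed seed so the same design can be regenerated
    return [sorted(rng.sample(collection, subset_size)) for _ in range(n_subsets)]


def with_challenge(subset, challenge_strains):
    """Return a community in which pathogens or pathobionts are added to a subset."""
    return subset + [s for s in challenge_strains if s not in subset]


# Hypothetical identifiers for one donor's arrayed collection (12 strains).
collection = [f"strain_{i:02d}" for i in range(1, 13)]

# Four randomized five-member communities, each also tested with a challenge strain.
subsets = randomized_subsets(collection, subset_size=5, n_subsets=4, seed=42)
communities = [with_challenge(s, ["pathobiont_A"]) for s in subsets]
```

Each list in `subsets` would correspond to one community introduced into a group of recipient animals, and the matching entry in `communities` to the same community with the challenge organism added. A rationally designed subset would replace the random draw with a hand-picked list.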


The weaning-oriented microbiota and beyond
Culture-independent studies have characterized a programme of gut 
microbial community development that is executed during the first
2-3 years of postnatal life, as infants move from a diet dominated by
milk through a period of complementary feeding to a fully weaned 
state. In one study, monthly collection of faecal samples from mem- 
bers of a Bangladeshi birth cohort with healthy growth phenotypes 
allowed the generation of 16S rRNA-sequence-based data sets that 
described the bacterial composition of their developing gut commu- 
nities”. This study used Random Forests-based models to identify a 
group of age-discriminatory bacterial strains, the relative abundances 
of which defined the state of development (‘age’) of a child’s microbiota. 
Remarkably, many of these age-discriminatory strains were also pre- 
sent in models of normal microbiota development in healthy Malawian 
infants and children”. 

Deviations from normal can be expressed in the form of a micro- 
biota-for-age Z-score (MAZ). Calculating MAZ scores disclosed that 
microbiota development was impaired in Malawian and Bangladeshi 
children who presented with moderate or severe acute malnutri- 
tion”. Their microbial communities appeared ‘younger’ than would 
be expected from their chronological age. Moreover, this microbiota 
immaturity is not durably repaired by treatment with current ready- 
to-use therapeutic foods”. Transplanting immature microbiota 
from Malawian children who are stunted or underweight, or from 
chronologically age-matched donors who have healthy growth phe- 
notypes, to young germ-free mice fed a diet resembling that consumed 
by the microbiota donors showed that immature microbiota transmit 
impaired growth phenotypes”. 
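
The text does not spell out how a MAZ is computed. As a minimal sketch, assuming the conventional Z-score form (a child's model-predicted microbiota age minus the median microbiota age of healthy children of the same chronological age, divided by the standard deviation of that healthy reference group), the calculation could look like the following. The function name and all numbers are hypothetical and serve only as illustration.

```python
from statistics import median, stdev


def microbiota_for_age_z(microbiota_age, healthy_reference_ages):
    """Z-score of a predicted microbiota age against healthy age-matched children.

    microbiota_age: model-predicted 'age' of the child's microbiota (months).
    healthy_reference_ages: predicted microbiota ages (months) of healthy
        children who share the child's chronological age.
    """
    if len(healthy_reference_ages) < 2:
        raise ValueError("need at least two reference children")
    ref_median = median(healthy_reference_ages)
    ref_sd = stdev(healthy_reference_ages)  # sample standard deviation
    if ref_sd == 0:
        raise ValueError("reference group has no spread")
    return (microbiota_age - ref_median) / ref_sd


# Hypothetical values: healthy 12-month-old children, and one child whose
# microbiota looks like that of a typical 8-month-old.
healthy_at_12_months = [10.0, 11.0, 12.0, 13.0, 14.0]
maz = microbiota_for_age_z(8.0, healthy_at_12_months)
```

Under this assumed form, a negative value corresponds to a microbiota that appears ‘younger’ than expected for the child's chronological age.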

These and other studies provide preclinical proof-of-concept that 
gut microbiota development is causally related to healthy growth*””*. 
They also provide a microbial measure of normal as well as perturbed 
postnatal development. An important question is how microbiota 
development affects development of the immune system. This issue 
can be addressed in part by defining IgA responses of the gut mucosa 
to members of the microbiota”, using faecal samples serially collected
from members of birth-cohort studies”. This approach represents one
way to identify relationships between microbial community development,
development of the immune system, breast milk HMO content
and host growth phenotypes.

Figure 2 | Discovery pipeline for characterizing the functional properties
of developing human microbial communities. Samples of intact, uncultured
microbiota are obtained from infants and children with healthy growth
phenotypes and normal microbial community development and from those
with perturbed community development, or from their mothers. Clonally
arrayed collections of cultured organisms are then generated from these
microbial communities. The effects of different community configurations
on host biology are tested by transplanting these collections, or subsets of the
collections, into germ-free animals (mice or other species). Recipient animals
are fed diets representative of those consumed by their microbiota donors,
or diets designed to test hypotheses about the role of various components,
including HMOs, on microbiota-mediated functions. Follow-up studies can be
performed by assessing the transmission of microbial communities of interest
and associated phenotypes to the offspring of these gnotobiotic animals.
[Figure 2 labels: Population of interest; Samples of faecal, oral or vaginal
microbiota; Clonally arrayed bacterial culture collection; Randomized or
rationally designed culture subsets; Transplantation into germ-free animals;
Representative diets (+/- HMOs; +/- dietary ingredients; +/- antibiotics);
Microbial community characteristics (membership and stability; gene
expression (RNA and protein); metabolic features); Effects on host
phenotypes (growth; metabolism; behavioural phenotypes); Features of
innate and adaptive immunity (targeting of bacteria by IgA; susceptibility
to pathogenic bacteria; responsiveness to vaccines); Microbes transferred
to offspring; Transmission of microbiota-dependent phenotypes between
generations determined.]


A call for human microbial community observatories
Characterizing normal gut microbiota development and the development
of other body-habitat-associated microbial communities in members
of birth cohorts provides a framework for exploring the degree to
which these processes vary across populations of infants and children
with healthy growth phenotypes. Whether — and how — perturbations
of these programmes are related to growth faltering and the risk for and
development of various diseases can also be investigated. These studies
should include an examination of the mother and her microbial communities
starting at the time of conception, and of the impact of these
communities on fetal development. The results could yield insights
about as yet unappreciated microbial contributions to a wide range
of disorders that are overtly manifest, or foreshadowed, by changes of
microbial community structure and function in infancy or childhood
(for example, obesity’, immunological disorders including atopic
states”° and neurodevelopmental disorders”).

Given the dramatic, myriad and rapid changes in our lifestyles
wrought by globalization, as well as the vast differences in sanitation
and hygiene experienced by various populations, we propose that a
series of ‘human microbial observatories’ be established to characterize
the evolution of microbial communities in mothers before, during and
after pregnancy, to monitor fetal development and to characterize the
development of microbial communities in their offspring (and perhaps
in the future, in the pregnancies of these children and their offspring).
We propose that the populations selected for study should not only
illustrate currently distinct lifestyles and geographies, but also contain
segments that are likely to undergo lifestyle changes within a generation.
Organizations, both private and public, that are committed to address- 
ing global health challenges have already made investments that have 
enabled durable, trusting relationships to be established between 
health-care providers and such populations, as well as the infrastructure 
required to obtain informed consent and apply validated procedures 
for collecting and archiving biospecimens and associated metadata. 
Examples include the Global Enteric Multicenter Study (GEMS)”; the 
Etiology, Risk Factors, and Interactions of Enteric Infections and Malnu- 
trition and the Consequences for Child Health (MAL-ED) Study”; and 
various water, sanitation and hygiene (WASH) programmes”. These 
investments should be leveraged for the proposed human microbial 
observatories, which will require sustained support given the extended 
period of observation. Effective and innovative strategies for achieving 
such durable support require expertise from multiple disciplines. In our 
opinion, the development of these strategies is a compelling challenge 
whose solutions have broad implications for obtaining answers to this 
biological question as well as myriad others related to the promotion of 
human flourishing (eudaimonia) in the broadest sense. 

Wise and effective stewardship of human microbial resources is a 
responsibility that extends across generations and national bounda- 
ries. Knowledge of how microbial communities evolve in health and 
how their development is jeopardized or overtly disrupted provides 
an opportunity to discover strategies and tools for their timely repair. 
However, understanding how such repair can be achieved brings great 
responsibility. The immediate as well as long-term consequences of such 
interventions applied early in the course of a human life need to be 
determined. Rigorous tests of safety and efficacy have to be designed 
and applied in representative animal models when available. Thoughtful 
consideration must be given to the ethical, regulatory and societal issues 
and consequences that could arise from early interventions that shape 
the composition and function of our microbial communities. This is a 
time for inspiration and awe as we gain insight about how we function as 
holobionts. It is also a time for mindfulness and sobriety as we consider 
how to deliberately shape facets of our own developmental biology to 
improve wellness during our human lifecycle.


Received 2 November 2015; accepted 25 April 2016. 


1. Sender, R., Fuchs, S. & Milo, R. Are we really vastly outnumbered? Revisiting the 
ratio of bacterial to host cells in humans. Cell 164, 337-340 (2016).

2. Qin, J. et al. A human gut microbial gene catalogue established by
metagenomic sequencing. Nature 464, 59-65 (2010). 

3. The Human Microbiome Project Consortium. Structure, function and diversity 
of the healthy human microbiome. Nature 486, 207-214 (2012). 

4. Gordon, J., Knowlton, N., Relman, D. A., Rohwer, F. & Youle, M. Superorganisms 
and holobionts. Microbe 8, 152-153 (2013). 

5. Levison, M. E., Corman, L. C., Carrington, E. R. & Kaye, D. Quantitative microflora 
of the vagina. Am. J. Obstet. Gynecol. 127, 80-85 (1977). 

6. Ravel, J. et al. Vaginal microbiome of reproductive-age women. Proc. Natl Acad.
Sci. USA 108 (suppl. 1), 4680-4687 (2011). 

7. Dareng, E. O. et al. Prevalent high-risk HPV infection and vaginal microbiota in
Nigerian women. Epidemiol. Infect. 144, 123-137 (2016). 

8. Koren, O. et al. A guide to enterotypes across the human body: meta-analysis of
microbial community structures in human microbiome datasets. PLoS Comput.
Biol. 9, e1002863 (2013).

9. Hillier, S. L. et al. Association between bacterial vaginosis and preterm delivery 
of a low-birth-weight infant. N. Engl. J. Med. 333, 1737-1742 (1995). 

10. Horner-Devine, M. C. & Bohannan, B. J. Phylogenetic clustering and 
overdispersion in bacterial communities. Ecology 87, S100-S108 (2006). 

11. Stowell, S. R. et al. Microbial glycan microarrays define key features of host-
microbial interactions. Nature Chem. Biol. 10, 470-476 (2014). 

12. Romero, R. et al. The vaginal microbiota of pregnant women who subsequently 
have spontaneous preterm labor and delivery and those with a normal delivery 
at term. Microbiome 2, 18 (2014). 

13. Romero, R. et al. The composition and stability of the vaginal microbiota 
of normal pregnant women is different from that of non-pregnant women. 
Microbiome 2, 4 (2014). 

14. DiGiulio, D. B. et al. Temporal and spatial variation of the human microbiota 
during pregnancy. Proc. Natl Acad. Sci. USA 112, 11060-11065 (2015). 
This study showed that the composition of the vaginal microbiota early in 
pregnancy may predict subsequent premature birth, which raises questions 
about how this community of microbes shapes maternal health and 
pregnancy outcomes. 

15. Aagaard, K. et al. A metagenomic approach to characterization of the vaginal 
microbiome signature in pregnancy. PLoS ONE 7, e36466 (2012). 


16. Wilson, C. S. Nutritionally beneficial cultural practices. World Rev. Nutr. Diet. 45,
68-96 (1985).

17. Bisanz, J. E. et al. Microbiota at multiple body sites during pregnancy in a rural
Tanzanian population and effects of moringa-supplemented probiotic yogurt.
Appl. Environ. Microbiol. 81, 4965-4975 (2015).

18. Macintyre, D. A. et al. The vaginal microbiome during pregnancy and the
postpartum period in a European population. Sci. Rep. 5, 8988 (2015).

19. Huang, Y. E. et al. Homogeneity of the vaginal microbiome at the cervix,
posterior fornix, and vaginal canal in pregnant Chinese women. Microb. Ecol.
69, 407-414 (2015).

20. Burke, B. S. & Stevenson, S. S. Nutrition studies during pregnancy; relation of
maternal nutrition to condition of infant at birth; study of siblings. J. Nutr. 38,
453-467 (1949).

21. Koren, O. et al. Host remodeling of the gut microbiome and metabolic changes
during pregnancy. Cell 150, 470-480 (2012).

22. Liu, B. et al. Deep sequencing of the oral microbiome reveals signatures of
periodontal disease. PLoS ONE 7, e37919 (2012).

23. Duran-Pinedo, A. E. et al. Community-wide transcriptome of the oral
microbiome in subjects with and without periodontitis. ISME J. 8, 1659-1672
(2014).

24. Siqueira, F. M. et al. Intrauterine growth restriction, low birth weight, and
preterm birth: adverse pregnancy outcomes and their association with
maternal periodontitis. J. Periodontol. 78, 2266-2276 (2007).

25. DiGiulio, D. B. et al. Microbial prevalence, diversity and abundance in amniotic
fluid during preterm labor: a molecular and culture-based investigation. PLoS
ONE 3, e3056 (2008).

26. DiGiulio, D. B. et al. Prevalence and diversity of microbes in the amniotic fluid,
the fetal inflammatory response, and pregnancy outcome in women with
preterm pre-labor rupture of membranes. Am. J. Reprod. Immunol. 64, 38-57
(2010).

27. Han, Y. W. et al. Transmission of an uncultivated Bergeyella strain from the
oral cavity to amniotic fluid in a case of preterm birth. J. Clin. Microbiol. 44,
1475-1483 (2006).

28. Han, Y. W., Shen, T., Chung, P., Buhimschi, I. A. & Buhimschi, C. S. Uncultivated
bacteria as etiologic agents of intra-amniotic inflammation leading to preterm
birth. J. Clin. Microbiol. 47, 38-47 (2009).

29. Swati, P., Thomas, B., Vahab, S. A., Kapaettu, S. & Kushtagi, P. Simultaneous
detection of periodontal pathogens in subgingival plaque and placenta of
women with hypertension in pregnancy. Arch. Gynecol. Obstet. 285, 613-619
(2012).

30. Costalonga, M. & Herzberg, M. C. The oral microbiome and the immunobiology
of periodontal disease and caries. Immunol. Lett. 162, 22-38 (2014). 

Teng, F. et al. Prediction of early childhood caries via spatial-temporal variations 
of oral microbiota. Cell Host Microbe 18, 296-306 (2015). 

Kustner, O. Beitrag zur Lehre von der puerperalen Infection der Neugeborenen. 
Arch. Gynakol. 11, 256-263 (1877). 

Harris, J. W. & Brown, J. H. The bacterial content of the uterus at cesarean 
section. Am. J. Obstet. Gynecol. 13, 133-143 (1927). 

Benirschke, K. Routes and types of infection in the fetus and the newborn. AMA 
J. Dis. Child. 99, 714-721 (1960). 
Verstraelen, H. et al. Characterisation of the human uterine microbiome in non- 
pregnant women through deep sequencing of the V1-2 region of the 16S rRNA 
gene. PeerJ 4, e1602 (2016). 
itchell, C. M. et a/. Colonization of the upper genital tract by vaginal bacterial 
species in nonpregnant women. Am. J. Obstet. Gynecol. 212, 611.e1-611.e9 
(2015). 
Stout, M. J. et al. Identification of intracellular bacteria in the basal plate of the 
human placenta in term and preterm gestations. Am. J. Obstet. Gynecol. 208, 
226.e1-226.e7 (2013). 

Aagaard, K. et al. The placenta harbors a unique microbiome. Sci. Trans/. Med. 6, 
2371ra65 (2014). 

Bhola, K. et a/. Placental cultures in the era of peripartum antibiotic use. Aust. N. 
Z. J. Obstet. Gynaecol. 48, 179-184 (2008). 

Kliman, H. J. Comment on “The placenta harbors a unique microbiome”. Sci. 
Transl. Med. 6, 254le4 (2014). 

Salter, S. J. et al. Reagent and laboratory contamination can critically impact 
sequence-based microbiome analyses. BMC Biol. 12, 87 (2014). 

Prince, A. L. et al. The placental microbiome is altered among subjects with 
spontaneous preterm birth with and without chorioamnionitis. Am. J. Obstet. 
Gynecol. 214, 627.e1-627.e16 (2016). 

Kim, M. J. et al. Widespread microbial invasion of the chorioamniotic 
membranes is a consequence and not a cause of intraamniotic infection. Lab. 
Invest. 89, 924-936 (2009). 

Hansen, R. et al. First-pass meconium samples from healthy term vaginally- 
delivered neonates: an analysis of the microbiota. PLoS ONE 10, e€0133320 
(2015). 


. Ardissone, A. N. et al. Meconium microbiome analysis identifies bacteria 


correlated with premature birth. PLoS ONE 9, e90784 (2014). 


. Romero, R. et al. The role of infection in preterm labour and delivery. Paediatr. 


Perinat. Epidemiol. 15 (suppl. 2), 41-56 (2001). 


. Bearfield, C., Davenport, E. S., Sivapathasundaram, V. & Allaker, R. P. Possible 


association between amniotic fluid micro-organism infection and microflora in 
the mouth. BJOG 109, 527-533 (2002). 


. Menon, R., Peltier, M. R., Eckardt, J. & Fortunato, S. J. Diversity in cytokine 


response to bacteria associated with preterm birth by fetal membranes. Am. J. 
Obstet. Gynecol. 201, 306.e1-306.e6 (2009). 


© 2016 Macmillan Publishers Limited. All rights reserved. 


49. 


50. 


51. 


52. 


53. 


54. 


55. 


56. 


OT: 


58. 


59. 


60. 


61. 


62. 


63. 


64. 


65. 


66. 


67. 


68. 


69. 


70. 


71. 


72. 


73. 


74. 


75. 


76. 


Th fe 


Gomez de Agiiero, M. et al. The maternal microbiota drives early postnatal 
innate immune development. Science 351, 1296-1302 (2016). 

A germ-free mouse model of transient microbial colonization demonstrates 
that exposure of the mother to microbes during pregnancy shapes 
immunological development and function in the neonate. 

Muglia, L. J. & Katz, M. The enigma of spontaneous preterm birth. N. Engl. J. 
Med. 362, 529-535 (2010). 

McGuire, M. K. & McGuire, M. A. Human milk: mother nature’s prototypical 
probiotic food? Adv. Nutr. 6, 112-123 (2015). 

Palmer, C., Bik, E. M., DiGiulio, D. B., Relman, D. A. & Brown, P. 0. Development 
of the human infant intestinal microbiota. PLoS Biol. 5, e177 (2007). 

Wu, S., Tao, N., German, J. B., Grimm, R. & Lebrilla, C. B. Development of an 
annotated library of neutral human milk oligosaccharides. J. Proteome Res. 9, 
4138-4151 (2010). 

Wu, S., Grimm, R., German, J. B. & Lebrilla, C. B. Annotation and structural 
analysis of sialylated human milk oligosaccharides. J. Proteome Res. 10, 
856-868 (2011). 

Kunz, C. & Rudloff, S. Biological functions of oligosaccharides in human milk. 
Acta Paediatr. 82, 903-912 (1993). 

inonuevo, M. R. et al. A strategy for annotating the human milk glycome. 

J. Agric. Food Chem. 54, 7471-7480 (2006). 

Totten, S. M. et al. Comprehensive profiles of human milk oligosaccharides 
yield highly sensitive and specific markers for determining secretor status in 
actating mothers. J. Proteome Res. 11, 6124-6133 (2012). 

Bode, L. Human milk oligosaccharides: every baby needs a sugar mama. 
Glycobiology 22, 1147-1162 (2012). 

Thurl, S. et a/. Variation of human milk oligosaccharides in relation to milk 
groups and lactational periods. Br J. Nutr. 104, 1261-1271 (2010). 

Totten, S. M. et al. Rapid-throughput glycomics applied to human milk 
oligosaccharide profiling for large human studies. Anal. Bioanal. Chem. 406, 
7925-7935 (2014). 

This paper highlights nanoflow liquid chromatography mass spectrometry, 
a method that allows the rapid and reproducible detection of HMOs in low- 
volume biological samples, enabling large-scale clinical studies. 

Coppa, G. V. et a/. Changes in carbohydrate composition in human milk over 
4 months of lactation. Pediatrics 91, 637-641 (1993). 

Nifionuevo, M. R. et al. Daily variations in oligosaccharides of human milk 
determined by microfluidic chips and mass spectrometry. J. Agric. Food Chem. 
56, 618-626 (2008). 

Chaturvedi, P. et al. Fucosylated human milk oligosaccharides vary between 
individuals and over the course of lactation. Glycobiology 11, 365-372 (2001). 
De Leoz, M. L. et a/. Lacto-N-tetraose, fucosylation, and secretor status are 
highly variable in human milk oligosaccharides from women delivering 
preterm. J. Proteome Res. 11, 4662-4672 (2012). 

Charbonneau, M. R. et al. Sialylated milk oligosaccharides promote microbiota- 
dependent growth in models of infant undernutrition. Cel! 164, 859-87 1 
(2016). 

Gnotobiotic mouse and piglet models were used to show that sialylated milk 
oligosaccharides play a causal, microbiota-dependent role in lean body-mass 
gain, bone growth and metabolism. 

Gal, B. et a/. Development changes in UDP-N-acetylglucosamine 2-epimerase 
activity of rat and guinea-pig liver. Comp. Biochem. Physiol. B 108, 13-15 
(1997). 

Wang, B. Sialic acid is an essential nutrient for brain development and 
cognition. Annu. Rev. Nutr. 29, 177-222 (2009). 

Wang, B. & Brand-Miller, J. The role and potential of sialic acid in human 
nutrition. Eur. J. Clin. Nutr. 57, 1351-1369 (2003). 

Wang, B. et al. Dietary sialic acid supplementation improves learning and 
memory in piglets. Am. J. Clin. Nutr. 85, 561-569 (2007). 

Yonekawa, T. et a/. Sialyllactose ameliorates myopathic phenotypes in 
symptomatic GNE myopathy model mice. Brain 137, 2670-2679 (2014). 
Chen, X. Human milk oligosaccharides (HMOS): structure, function, and 
enzyme-catalyzed synthesis. Adv. Carbohydr. Chem. Biochem. 72, 113-190 
(2015). 

Aldredge, D. L. et a/. Annotation and structural elucidation of bovine milk 
oligosaccharides and determination of novel fucosylated structures. 
Glycobiology 23, 664-676 (2013). 

Sundekilde, U. K. et a/. Natural variability in bovine milk oligosaccharides from 
Danish Jersey and Holstein-Friesian breeds. J. Agric. Food Chem. 60, 6188- 
6196 (2012). 

Muoio, D. M. Metabolic inflexibility: when mitochondrial indecision leads to 
metabolic gridlock. Ce// 159, 1253-1262 (2014). 

Mueller, N. T., Bakacs, E., Combellick, J., Grigoryan, Z. & Dominguez-Bello, M. 
G. The infant microbiome development: mom matters. Trends Mol. Med. 21, 
109-117 (2015). 

Lewis, Z. T. et a/. Maternal fucosyltransferase 2 status affects the gut 
bifidobacterial communities of breastfed infants. Microbiome 3, 13 (2015). 
Huda, M. N. et al. Stool microbiota and vaccine responses of infants. Pediatrics 
134, e362-e372 (2014). 


78. 


79. 


80. 


81. 


82. 


83. 


84. 


85. 


86. 


87. 


88. 


89. 


90. 


91. 


92. 


93. 


94. 


95. 


96. 


97. 


98. 


99. 


PERSPECTIVE INSIGHT | 


Subramanian, S. et al. Persistent gut microbiota immaturity in malnourished 
Bangladeshi children. Nature 510, 417-421 (2014). 

This study used a machine-learning approach to define normal microbiota 
development in Bangladeshi infants and children and revealed a persistent 
defect in microbiota development in children that exhibit undernutrition. 
Fukuda, S. et al. Bifidobacteria can protect from enteropathogenic infection 
through production of acetate. Nature 469, 543-547 (2011). 

Sugahara, H., Odamaki, T., Hashikura, N., Abe, F. & Xiao, J. Z. Differences in folate 
production by bifidobacteria of different origins. Biosci. Microbiota Food Health 
34, 87-93 (2015). 

Garrido, D., Dallas, D. C. & Mills, D. A. Consumption of human milk 
glycoconjugates by infant-associated bifidobacteria: mechanisms and 
implications. Microbiology 159, 649-664 (2013). 

Garrido, D. et al. Comparative transcriptomics reveals key differences in the 
response to milk oligosaccharides of infant gut-associated bifidobacteria. Sci. 
Rep. 5, 13517 (2015). 

Marcobal, A. et al. Bacteroides in the infant gut consume milk oligosaccharides 
via mucus-utilization pathways. Cel! Host Microbe 10, 507-514 (2011). 

De Leoz, M. L. et al. Human milk glycomics and gut microbial genomics in 
infant feces show a correlation between human milk oligosaccharides and gut 
microbiota: a proof-of-concept study. J. Proteome Res. 14, 491-502 (2015). 
Ng, K. M. et a/. Microbiota-liberated host sugars facilitate post-antibiotic 
expansion of enteric pathogens. Nature 502, 96-99 (2013). 

Frese, S. A. & Mills, D. A. Should infants cry over spilled milk? Fecal glycomics as 
an indicator of a healthy infant gut microbiome. J. Pediatr. Gastroenterol. Nutr. 
60, 695 (2015). 

Frese, S. A., Parker, K., Calvert, C. C. & Mills, D. A. Diet shapes the gut 
microbiome of pigs during nursing and weaning. Microbiome 3, 28 (2015). 
Shin, N. R., Whon, T. W. & Bae, J. W. Proteobacteria: microbial signature of 
dysbiosis in gut microbiota. Trends Biotechnol. 33, 496-503 (2015). 

Blanton, L. V. et al. Gut bacteria that prevent growth impairments transmitted 
by microbiota from malnourished children. Science 351, aad3311 (2016). 
Schwarzer, M. et al. Lactobacillus plantarum strain maintains growth of infant 
mice during chronic undernutrition. Science 351, 854-857 (2016). 

Kau, A. L. et al. Functional characterization of IgA-targeted bacterial taxa from 
undernourished Malawian children that produce diet-dependent enteropathy. 
Sci. Transl. Med. 7, 276ra24 (2015). 

Planer, J. D et a/. Development of the gut microbiota and mucosal IgA responses 
in twins and gnotobiotic mice. Nature 534, 263-266 (2016). 

Cox, L. M. et a/. Altering the intestinal microbiota during a critical developmental 
window has lasting metabolic consequences. Cel/ 158, 705-721 (2014). 
Dogra, S. et al. Dynamics of infant gut microbiota are influenced by delivery 
mode and gestational duration and are associated with subsequent adiposity. 
mBio 6, €02419-14 (2015). 

Cho, I. et al. Antibiotics in early life alter the murine colonic microbiome and 
adiposity. Nature 488, 621-626 (2012). 

Arrieta, M. et al. Early infancy microbial and metabolic alterations affect risk of 
childhood asthma. Sci. Trans/. Med. 7, 307ra152 (2015). 

Goyal, M. S., Venkatesh, S., Milbrandt, J., Gordon, J. |. & Raichle, M. E. Feeding 
the brain and nurturing the mind: linking nutrition and the gut microbiota to 
brain development. Proc. Nat! Acad. Sci. USA 112, 14105-14112 (2015). 
Levine, M. M., Kotloff, K. L., Nataro, J. P.& Muhsen, K. The Global Enteric 
Multicenter Study (GEMS): impetus, rationale, and genesis. Clin. Infect. Dis. 55 
(suppl. 4), S215-S224 (2012). 

MAL-ED Network Investigators. The MAL-ED study: a multinational and 
multidisciplinary approach to understand the relationship between 

enteric pathogens, malnutrition, gut physiology, physical growth, cognitive 
development, and immune responses in infants and children up to 2 years of 
age in resource-poor environments. Clin. Infect. Dis. 59 (suppl. 4), S193-S206 
(2014). 

This paper describes a large, multi-site birth cohort study that includes 

an effort to serially sample microbial communities in infants to identify 
correlations between the composition and the development of the microbiota, 
postnatal growth phenotypes and other facets of health. 


100.Ngure, F. M. et al. Water, sanitation, and hygiene (WASH), environmental 


enteropathy, nutrition, and early child development: making the links. Ann. NY 
Acad. Sci. 1308, 118-128 (2014). 


Acknowledgements Work cited from the authors’ laboratories was supported in
part by grants from the US National Institutes of Health (DK30292, HD061923,
AT007079 and AT008759), the Bill & Melinda Gates Foundation, the March
of Dimes Foundation and the Thomas C. and Joan M. Merigan Endowment at
Stanford University.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare competing financial interests: see 
go.nature.com/28itcop. Readers are welcome to comment on the online version of 
this paper at go.nature.com/28itcop. Correspondence should be addressed to J.I.G. 
(jgordon@wustl.edu). 


7 JULY 2016 | VOL 535 | NATURE | 55 


© 2016 Macmillan Publishers Limited. All rights reserved. 


REVIEW 


doi:10.1038/nature18846 


Diet-microbiota interactions as 
moderators of human metabolism 


Justin L. Sonnenburg¹ & Fredrik Bäckhed²,³


It is widely accepted that obesity and associated metabolic diseases, including type 2 diabetes, are intimately linked to 
diet. However, the gut microbiota has also become a focus for research at the intersection of diet and metabolic health. 
Mechanisms that link the gut microbiota with obesity are coming to light through a powerful combination of translation- 
focused animal models and studies in humans. A body of knowledge is accumulating that points to the gut microbiota as a 
mediator of dietary impact on the host metabolic status. Efforts are focusing on the establishment of causal relationships 
in people and the prospect of therapeutic interventions such as personalized nutrition. 


Worldwide, obesity has more than doubled since 1980 according
to the World Health Organization. In 2014, more than
1.9 billion adults were overweight, and over 600 million of
those people were obese. Obesity results from a positive energy balance, 
which occurs when the amount of energy ingested exceeds the amount 
expended, and it is a strong risk factor for other metabolic complications 
such as type 2 diabetes. Type 2 diabetes is increasing in prevalence in 
low-income countries, and in 2014, approximately 422 million adults 
worldwide had diabetes. The condition is characterized by high blood 
sugar, resistance to insulin and a relative lack of insulin. Insulin resist- 
ance is also associated with an increased flux of free fatty acids that 
contribute to diabetic dyslipidaemia, which is characterized by a high 
concentration of triglycerides in blood plasma, a low concentration of 
high-density lipoprotein (HDL) cholesterol and an increased concen- 
tration of small, dense low-density lipoprotein cholesterol particles’. 
Dyslipidaemia is one of the major risk factors for cardiovascular disease 
in people with diabetes. Accordingly, abnormal metabolism of glucose 
and lipids is the hallmark of metabolic syndrome, which is defined by 
central (abdominal) obesity and the presence of two or more of four 
factors — elevated triglycerides, reduced HDL cholesterol, high blood 
pressure, and increased fasting blood glucose. As governments and 
health organizations struggle to find solutions to these largely prevent- 
able health issues, a rapidly expanding area of research that is focused 
on the microbes that live within our digestive tract is offering fresh and 
interesting insights and potential avenues for intervention. 
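
The definition of metabolic syndrome given above (central obesity plus at least two of four further factors) can be written as a simple rule. The sketch below is illustrative only: the class and function names are hypothetical, and each factor is passed as a yes/no flag because the text does not state clinical cut-off values.

```python
from dataclasses import dataclass


@dataclass
class MetabolicFlags:
    """Yes/no findings for one person (hypothetical structure, not from the review)."""
    central_obesity: bool
    elevated_triglycerides: bool
    reduced_hdl_cholesterol: bool
    high_blood_pressure: bool
    increased_fasting_glucose: bool


def meets_metabolic_syndrome_definition(flags: MetabolicFlags) -> bool:
    """Return True for central obesity plus two or more of the four other factors."""
    additional_factors = [
        flags.elevated_triglycerides,
        flags.reduced_hdl_cholesterol,
        flags.high_blood_pressure,
        flags.increased_fasting_glucose,
    ]
    # Booleans sum as 0/1, so this counts how many additional factors are present.
    return flags.central_obesity and sum(additional_factors) >= 2
```

Under this rule, central obesity is required: a person with all four additional factors but no central obesity would not meet the definition as it is stated here.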

The human gut is a bioreactor with a microbiota that typically encom- 
passes hundreds or thousands of bacterial taxa, which predominantly 
belong to two phyla: Firmicutes and Bacteroidetes**. Tremendous 
strides have been taken over the past decade towards mapping the 
composition and basic functional attributes of the gut microbiota of 
people from industrialized countries*”. This ensemble of organisms has 
coevolved with the human host and complements the coding potential 
of our own genome with 500-fold more genes°. However, the annota- 
tion, and consequently the biological function, of many of these remain 
poorly defined. 

The observation that germ-free mice, which lack a microbiota, have 
reduced adiposity and improved tolerance to glucose and insulin when 
compared with conventional (colonized) counterparts’ jump-started 
a decade of research that focused on the clarification of underlying 
mechanisms. Germ-free mice are protected from diet-induced obesity
when fed a Western-style diet* "°, which further supports a link between
the gut microbiota and the host metabolism. The altered microbiota
that is observed in genetically obese mice’ is sufficient to promote 
increased adiposity in lean mice that receive a microbiota transplant”, 
demonstrating that the microbiota contributes to the regulation of adi- 
posity. The importance and generalizability of these initial findings are 
strengthened by reports of alterations in the gut microbiota of obese 
people*'?”, which confer the obese or adiposity phenotypes when 
transferred to mice’*’®. 

Here, we review the large body of data that is shaping our understand- 
ing of how the gut microbiota can alter the absorption, metabolism and 
storage of calories. Despite broad agreement that gut microbes modify 
how the human body responds to components of diet to influence 
metabolism, the mechanisms that underlie this process are exception- 
ally complex and the data can be difficult to reconcile. The picture that is 
emerging suggests that obesity is associated with reduced diversity of the 
gut microbiota’*"”. Systemic inflammation and microbial metabolites, 
such as bile acids and short-chain fatty acids, are also commonly implicated.
The ability to easily access and reprogramme the composition and
function of the microbiota makes it an attractive target for intervention.


Diet as an important modulator of the gut microbiota 
Extensive research on the gut microbiota has shown that diet modulates 
the composition and function of this community of microbes in humans 
and other mammals'*”*, with the earliest literature”® published almost 
100 years ago. Human intervention studies from the past decade have 
revealed the extent to which different aspects of the microbiota can be 
influenced through dietary change; this can be summarized by three 
main themes. 

The first theme is that the microbiota of the human gut responds 
rapidly to large changes in diet. The existence of these fast, diet-induced 
dynamics is supported by evidence from people who switch between 
plant- and meat-based diets, who add more than 30 grams per day of 
specific dietary fibres to their diet or who follow either a high-fibre-low- 
fat diet or a low-fibre-high-fat diet for 10 days; in all cases, the compo- 
sition and function of the microbiota shifted over 1-2 days'*”°”’. Such 
marked shifts in response to nutrient availability are perhaps unsur- 
prising given that populations of microbes can double within an hour 
and the gut extensively purges the community every 24-48 hours. This 
responsiveness might represent an advantageous feature of enlisting
microbes as part of the digestive structure, especially when considering
the possible day-to-day variation in food that is available to foragers.
It might also be an inescapable consequence of dealing with a complex
and competitive microbial community that undergoes rapid turnover.

¹Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, California 94305, USA. ²Wallenberg Laboratory for Cardiovascular and Metabolic Research, Department of Molecular and Clinical Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, 413 45 Gothenburg, Sweden. ³Section for Metabolic Receptology, Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.

The second theme is that, despite these rapid dynamics, long-term 
dietary habits are a dominant force in determining the composition 
of an individual's gut microbiota. Despite detectable responses of the 
microbiota within 24 hours of dietary intervention, a 10-day feeding 
study in 10 people” failed to alter the major compositional features and 
the overall classification of each participant’s microbiota. Some, but not 
all, cross-sectional studies reported that long-term dietary trends are 
linked to features of microbiota composition®”*”””*. 

The third theme is that a particular change in diet can have a highly 
variable effect on different people owing to the individualized nature 
of their gut microbiota. For example, Ruminococcus bromii-related taxa 
bloomed in response to resistant-starch intervention in most of the 14 
obese men in one study; the lack of response in the other individuals 
might reflect an absence of such taxa in those people”. A dietary inter- 
vention that includes a boosted intake of fibre and a decreased intake 
of energy can increase microbiota diversity — as defined by the gene 
content of the faecal metagenome — for individuals who start with a 
low microbiota gene content, but not those who start with a high gene 
content”. These individualized responses might fit into categories that 
enable a precision rather than a personalized approach to understanding 
responsiveness to diet. 

The influence of diet on aspects of microbiota function might also 
help to explain how a specific metabolic input can alter microbiota com- 
position over time. In a study that focused on the enzymatic activity of 
trimethylamine lyase, mice that harbour microbiotas with low produc- 
tion of trimethylamine (TMA) could be converted into high producers 
when their diet was supplemented with the TMA-containing compound 
L-carnitine for 10 weeks”. Similarly, a microbiota-encoded degradation 
system for porphyran, a polysaccharide that is found in certain species
of edible seaweed, is rare in the microbiotas of Western people but
prominent in those of populations that regularly consume seaweed”.
This suggests that certain metabolic inputs can select for pathways as
well as the organisms that harbour those pathways. One corollary of this
interpretation is that there must be a reservoir of selectable functions,
either present at low levels within the gut microbial community or
able to invade from an environmental source. It is important to note that
numerous other non-dietary mechanisms, such as interstrain killing
that is mediated by the type VI secretion system, infection with bacteriophages
and priority effects of colonization through which strains
are able to exclude one another on the basis of relatedness of particular
genetic loci, can underlie microbial community dynamics and might
interact with or operate in parallel to dietary-mediated effects”.

Several issues can complicate the unravelling of mechanisms and the
interpretation of data in dietary intervention studies in humans. People
are notoriously poor at adhering to dietary regimes, and it is difficult
to accurately measure the extent of their adherence because the self-assessment
of food intake can be clouded by numerous factors. Budget
limitations often mean that researchers must choose either tightly controlled
studies of small cohorts (for example, those in which food is provided)
or larger cohort studies that could be confounded by the free will of the
participants and by their self-assessment. Because dietary change often
involves both the elimination and addition (that is, the substitution) of
dietary components, even the most successful intervention studies can
raise questions about which diet modification was responsible for the
change in the microbiota. A further complication is that many of the
dietary changes in such studies also have the potential to directly influence
host metabolism in a microbiota-independent way.

Figure 1 | Interactions between the diet and the gut microbiota dictate
the production of short-chain fatty acids. Dietary fibre is a source of
complex carbohydrates, which are required for the production of short-chain
fatty acids such as acetate, butyrate and propionate. When the
diversity of the microbiota is high and the diet contains many types of
complex carbohydrates (top right), a relatively high percentage of complex
carbohydrates will be accessible to the microbiota. But when the diversity
of the microbiota is low and the diet contains many types of complex
carbohydrates (left), only a low percentage of these complex carbohydrates
is accessible to the microbiota. If the fibre composition of the diet is
matched to the needs of a low-diversity microbiota (bottom right) by
limiting the types of complex carbohydrate that are available, the levels of
production of certain short-chain fatty acids, such as propionate, might
increase. However, the diversity of the microbiota will probably remain low
and it might not be able to provide as many functions as a diverse microbiota.
Consumption of a complex diet (top right) might result in increased levels
of production of multiple types of short-chain fatty acids and help to recruit
additional diversity to the gut microbiota. The level of propionate production
is correlated with the abundance of Bacteroides species in the gut, which
is consistent with the involvement of these bacteria in the production of
propionate’”*. Fermentation of fibre in the colon has been shown to decrease
pH levels, which can help to increase the diversity of the gut microbiota or
result in the reinforcement by certain taxa of a pH that favours their own
growth'’°!””.
Figure 1 panel labels: "Low microbiota diversity limits access to complex carbohydrates" and "Increased access to complex carbohydrates"; panel annotations indicate diversity and metabolic output. Axes: fibre complexity and microbiota diversity. Key: acetate, butyrate, propionate, bacterium, complex carbohydrates, lumen, intestinal epithelium.

As an alternative, animal models enable researchers to tightly control
the diet of subjects and to have multiple biological replicates that represent
the response of a single microbiota. Experimental models that lack
a gut microbiota offer further power for determining whether the effects
of diet in the host depend on the microbiota. For example, germ-free
rats harvest less energy from a polysaccharide-rich diet™ and germ-free
mice have a reduced adiposity despite an increased intake of food by
comparison with their colonized counterparts’, which demonstrates 
that the microbiota helps to extract energy from food. These results are 
consistent with the fact that the fermentation of dietary fibre represents 
one of the dominant microbial metabolic activities in the colon, the 
region of the gut in which the microbiota is most dense*™”’. 
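
The accessibility argument in the Figure 1 caption can be illustrated with a toy calculation. The sketch below is an assumption-laden simplification, not a model from the review: it treats the diet and the microbiota's degradative capacity as sets of carbohydrate types, counts types rather than quantities, and uses placeholder names (`type_A` and so on) that do not correspond to real compounds.

```python
def accessible_fraction(diet_carbohydrates, degradable_by_microbiota):
    """Fraction of carbohydrate types in the diet that the community can use."""
    diet = set(diet_carbohydrates)
    if not diet:
        return 0.0
    # Only types that are both eaten and degradable count as accessible.
    return len(diet & set(degradable_by_microbiota)) / len(diet)


# Placeholder carbohydrate types (hypothetical, for illustration only).
complex_diet = {"type_A", "type_B", "type_C", "type_D"}
matched_diet = {"type_A"}
diverse_microbiota = {"type_A", "type_B", "type_C", "type_D"}
low_diversity_microbiota = {"type_A"}

print(accessible_fraction(complex_diet, diverse_microbiota))        # 1.0
print(accessible_fraction(complex_diet, low_diversity_microbiota))  # 0.25
print(accessible_fraction(matched_diet, low_diversity_microbiota))  # 1.0
```

The three calls mirror the three situations in the caption: a diverse community on a complex diet, a low-diversity community on the same diet, and a low-diversity community on a diet matched to what it can degrade. Matching the diet raises the accessible fraction but leaves the community's repertoire unchanged.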

The short-chain fatty acid end-products of fermentation in the gut 
can be absorbed into the circulation to serve as both microbiota-gen- 
erated calories and important regulatory molecules, and it has been 
estimated that people who consumed a typical British diet in the 1980s 
received 6-10% of their energy from short-chain fatty acids*’. By con- 
trast, people who eat large quantities of plants, the main source of 
dietary fibre*’, such as those in certain African communities that consume
up to sevenfold more fibre than people in the industrialized world”,
might generate considerably more short-chain fatty acids, which therefore
probably contribute more to the whole-body energy requirement.
This is in agreement with the increased abundance of taxa that ferment 
polysaccharides in the gut microbiota of African populations”. Certain 
recurrent physiological states in mammals, such as the non-hibernating 
period in bears and advanced pregnancy”, result in a markedly altered 
microbiota with an increased capacity to harvest energy from the diet 
without metabolic derangement. It should also be noted that the effects 
observed in animal models extend beyond a simple improvement in 
calorie harvest. The microbiota of mice suppresses the expression of intestinal
angiopoietin-like protein 4, an inhibitor of the enzyme lipoprotein
lipase; this suppression increases lipoprotein-lipase activity in adipose tissue
and promotes the storage of fat’. Accordingly, mice that are deficient in
Angptl4 have increased adiposity, even under germ-free conditions’. 
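
To give a sense of scale for the estimate above that short-chain fatty acids supplied 6-10% of energy on a typical British diet in the 1980s, the arithmetic can be sketched as follows. The daily intake of 2,000 kcal is an assumed round figure chosen for illustration and does not come from the review; the function name is likewise hypothetical.

```python
def scfa_energy_kcal(total_energy_kcal, low_percent=6, high_percent=10):
    """Return the (low, high) energy in kcal attributed to short-chain fatty acids."""
    low = total_energy_kcal * low_percent / 100
    high = total_energy_kcal * high_percent / 100
    return low, high


# 2,000 kcal per day is an assumed intake, not a value from the text.
low, high = scfa_energy_kcal(2000)
print(f"{low:.0f}-{high:.0f} kcal per day from short-chain fatty acids")
```

The text does not quantify how much larger this contribution is in populations that eat more fibre, so the sketch does not extrapolate beyond the 6-10% range.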

Experiments that use a Western-style diet, which is devoid of fibres 
and rich in calories from saturated fat and sucrose, demonstrate that 
the gut microbiota regulates obesity through additional pathways*. For 
example, germ-free mice are protected from diet-induced obesity when 
fed high levels of sucrose and lard®, a diet that alters the composition of 
the gut microbiota. The presence of the microbiota is both necessary 
and sufficient for obesity: the transfer of microbiota from mice fed a 
Western diet to germ-free mice transfers the obese phenotype”. By con- 
trast, germ-free mice that are fed a high-fat diet with less sucrose are 
only partly protected against obesity”, and all protection from obesity 
(that is, microbiota-dependent obesity) is lost when sucrose is omitted 
from the diet*. The molecular mechanisms that underpin this finding 
are unknown. The source of dietary fat also seems to be important. 
Saturated and unsaturated fats have profoundly different effects on the 
gut microbiota, and the altered microbiota that results from feeding 
unsaturated fats can offer protection from lard-induced weight gain™. 
These findings suggest that simple carbohydrates and fats could exert 
unexpected effects on the host metabolism through the microbiota. Fur- 
ther research is required to clarify how microbial taxa and ecosystems 
interact with specific macronutrients. 

Emerging evidence suggests that the deleterious metabolic effects of 
processed foods might involve more than just macronutrients. Emul- 
sifiers and artificial sweeteners have been shown to be involved in the 
development of metabolic syndrome features through their modulation 
of the microbiota in mice”. In a study in seven people, artificial sweet- 
eners given at high doses resulted in insulin resistance after only 7 days”; 
however, this dramatic finding needs to be reproduced in a larger study.
These data provide evidence that artificial food additives might contrib- 
ute to metabolic disease through disruption of the microbiota. Notably, 
an important and unwavering commonality of Western dietary trends 
is the paucity of plant-based dietary fibre**, an important fuel for the 
microbiota. The absence of dietary fibre together with an abundance of 
nutrients that negatively affect the microbiota could be of considerable 
importance for understanding metabolic diseases. 


Microbial ecology in metabolic disease 

The interaction of numerous species, the allocation of resources and the 
dynamic response to perturbation within the gut provide many of the 
hallmarks of a complex ecosystem. The application of macroecological 


58 | NATURE | VOL 535 | 7 JULY 2016 


concepts to the gut microbiota might therefore be instructive in guiding 
scientific inquiry and understanding”, particularly when considering 
the associations between microbiota diversity and metabolic output 
(such as the link between short-chain fatty acids and obesity and meta- 
bolic disease). For example, many macroecology data suggest that the 
extent of biodiversity within an ecosystem can serve as an important 
measure of stability and robustness”, which are relevant to research that 
looks at the link between gut microbes and health. 

Three metagenomic studies’*”!”' have shown that improved meta- 
bolic health is associated with a relatively high microbiota gene con- 
tent and with an increased microbial diversity. These data indicate that 
the extent of the diversity might be an important factor for metabolic 
health, which is consistent with findings from microbiota studies that 
have focused on traditional human societies. The gut microbiota of 
eight hunter-gatherer or rural farming populations in various parts of 
the world showed increased bacterial diversity compared with those 
of Western populations’”***”*. Notably, the microbial taxa that are 
absent from the Western gut are found in many populations of tra- 
ditional people that have been separated for thousands of years on 
different continents. The parsimonious explanation for this is that 
industrialization has been accompanied by an overall decline in gut 
microbiota biodiversity as well as the loss of specific phylogenetic groups,
a potential consequence of modern lifestyles, medical practices and
processed foods. It is unclear whether certain taxa are keystones that 
promote diversity. It is also unknown whether the increased diversity 
is only a reflection of a healthy and varied diet or whether it directly 
contributes to protection from metabolic disease. One theory is that 
the microbiota of industrialized nations are experiencing a widespread 
change in functional capacity (for instance, altered production of short- 
chain fatty acids), which is contributing to modern health issues such 
as obesity **. Dietary reinforcement, and specifically the provision of 
diverse complex carbohydrates, could provide the key to sustaining, and 
perhaps recovering, a diverse resident ecosystem that is capable of the 
functions that the human body expects or requires (Fig. 1). A caveat is 
that diversity can be measured in many ways that include or exclude the 
relative abundances of species and the functions encoded within them. 
It is also important to note that a high level of biodiversity does not 
always correspond to a health-promoting ecosystem: for example, bac- 
terial vaginosis is characterized by a diversity greater than that observed 
in a healthy state”. Undoubtedly, an understanding of diversity within
the context of organism identity, location and function enriches the util- 
ity of measures that fail to capture important details when used alone. 
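
The caveat about how diversity is measured can be made concrete with a short sketch. The Python below is an illustration only: the two communities and their read counts are invented, and the function names are hypothetical. It contrasts richness, which ignores relative abundance, with the Shannon and inverse Simpson indices, which weight taxa by abundance.

```python
import math

def richness(counts):
    """Number of taxa observed at least once (ignores relative abundance)."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i^2); equals richness for a perfectly even community."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts)

# Hypothetical read counts per taxon (made-up numbers for illustration).
even_community = [25, 25, 25, 25]        # 4 taxa, perfectly even
skewed_community = [95, 1, 1, 1, 1, 1]   # 6 taxa, one dominant

for name, counts in [("even", even_community), ("skewed", skewed_community)]:
    print(name, richness(counts), round(shannon(counts), 3),
          round(inverse_simpson(counts), 3))
```

In this toy case the skewed community has more taxa but a lower Shannon index than the even one, so the two measures rank the communities in opposite order.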


Fuel for the microbial ecosystem 

Many of the plant polysaccharides that are found within dietary fibre 
are structurally complex. It is therefore unsurprising that the numer- 
ous enzymes that are required to de-modify, liberate, transport and 
metabolize component monosaccharides are not encoded within the 
human genome”. Furthermore, the time that would be required to 
perform these steps is probably not compatible with the rapid transit 
that occurs in the small intestine, the region of the gut in which simple 
carbohydrates are digested and absorbed. Consequently, complex car- 
bohydrates travel to the distal gut for fermentation by its dense com- 
munity of microbes. 

Many complex plant carbohydrates qualify as dietary fibre, according 
to laboratory tests. However, the amount of fibre that can be metabolized 
(for example, through the enzymatic degradation of glycosidic linkages 
and the fermentation of liberated monosaccharides into short-chain fatty 
acids) will depend on many factors, including the composition of the 
microbiota. Carbohydrates that can be metabolized by the microbiota are 
known as microbiota-accessible carbohydrates” and can be contrasted 
with those that pass through the digestive tract without undergoing 
metabolic transformation. This metabolic accessibility is an important 
distinguishing characteristic: it defines a carbohydrate as a resource that 
drives the interspecies economy within the gut and it implies that meta- 
bolic products, such as short-chain fatty acids, will be generated. 


© 2016 Macmillan Publishers Limited. All rights reserved. 


[Figure 2 labels: fibre; CA/CDCA (TβMCA); short-chain fatty acids; saturated fats; glucagon-like peptide 1; insulin resistance; pancreas]


Notably, high diversity in the microbiota corresponds with high 
levels of short-chain fatty acid production in rural farmers in Burkina 
Faso””, as well as with the enrichment of genes in the microbiome of 
hunter-gatherers that are associated with the metabolism of complex 
carbohydrates”. In a multigenerational study in mice, the consumption 
of a Western-style diet exacerbated the loss of microbiota diversity com- 
pared with a diet that was rich in microbiota-accessible carbohydrates, 
and the extinction of taxa corresponded with a predicted loss in diver- 
sity of glycoside hydrolases”. Several studies in humans indicate that 
there is a population-specific ‘ceiling’ on microbiota diversity and metabolic
output. For example, following a vegan diet for at least 6 months
or a high-fibre, low-fat diet for 10 days was insufficient to substantially
increase microbiota diversity or production of faecal short-chain fatty 
acids™. A plant-based diet could significantly alter the composition of 
the gut microbiota, although a change in diversity was not observed”. 
When fed high levels of resistant starch, individuals who fail to show 
a bloom in Ruminococcus bromii and its relatives also have the highest 
levels of undigested starch in their stool, which supports the idea that 
the composition of the microbiota determines whether a carbohydrate 
is accessible to the microbiota”. Overall, these data suggest that the 
production of short-chain fatty acids is affected by the existing diversity 
within a microbiota. 

Eating whole grains for just 3 days can improve tolerance to glucose in 
some people, and these ‘responders’ show an increased representation 
of specific glycoside hydrolases within the gut microbiome compared 
with non-responders who received the same dietary intervention”. This 
indicates that the microbiota might need to already have the capacity to 
degrade certain complex carbohydrates in the diet to reap the potential 
benefits of microbiota-accessible carbohydrates. Notably, individuals 


[Figure 2 labels: chylomicron; lipopolysaccharide]


REVIEW 


Figure 2 | Mechanisms of signalling from the 
gut microbiota to the host. The gut microbiota 
interacts with dietary components and metabolites 
to form bioactive metabolites that signal to the 
host through distinct mechanisms. Short-chain 
fatty acids that are produced by the fermentation 
of fibre are an important source of energy (ATP) 
for colonocytes. They are also a substrate for 
gluconeogenesis, which modulates central 
metabolism, and are involved in signalling to the 
host by inhibiting histone deacetylase (HDAC) 
or by activating G-protein-coupled receptors 
such as GPR41 and GPR43, which triggers the 
release of the hormone glucagon-like peptide-1. 
The primary bile acids cholic acid (CA) and 
chenodeoxycholic acid (CDCA) are metabolized 
into the secondary bile acids deoxycholic acid 
(DCA) and lithocholic acid (LCA), which 
activate signalling to the host through the
G-protein-coupled bile acid receptor 1 (GPBAR1;
also known as TGR5). Tauro-β-muricholic acid
(TβMCA) is deconjugated into β-muricholic
acid (βMCA; not shown), which alleviates the
inhibition of the farnesoid X-activated receptor
(FXR; also known as the bile acid receptor) by
TβMCA. Microbially produced endotoxins (also
known as lipopolysaccharides) are taken up
into chylomicrons that are formed from dietary
saturated fats and subsequently they promote 
inflammation in the host that induces insulin 
resistance. L-Carnitine and choline, compounds 
that are found in red meat, are metabolized into 
TMAs that are oxidized further into TMAO by 
the enzyme flavin-containing monooxygenase 3 
(FMO3) in the liver (inset). 


[Figure 2 inset labels: L-carnitine; choline]


whose microbiota and glucose tolerance respond to a whole-grain inter- 
vention tend to consume diets that are higher in fibre. The complex 
carbohydrates that are associated with whole grains and that were meta- 
bolically accessible to the microbiotas of the responders might therefore 
have been inaccessible to non-responders who also did not routinely 
consume high-fibre diets. 


Microbial metabolites 
Microbes that live in the gut continually produce numerous small 
molecules through primary and secondary metabolic pathways™, 
many of which are dependent on the diet of the host. Although some 
of these compounds are retained within the gut ecosystem, others will 
be absorbed into the circulation and then chemically modified (that 
is, co-metabolized) by the host, and eventually secreted in the urine. 
Much research has focused on short-chain fatty acids, which have been 
implicated in diverse roles in obesity and metabolic syndrome. Path- 
ways that generate short-chain fatty acids were found to be enriched 
in metagenomic studies of obesity, and levels of short-chain fatty acids 
were elevated in overweight or obese people and animal models”, 
which is consistent with these products of microbial fermentation pro- 
viding extra calories to the host. By contrast, increased levels of the 
short-chain fatty acid propionate promoted intestinal gluconeogen- 
esis™ or were associated with the microbiota following gastric bypass”, 
which conferred protection from diet-induced obesity on transfer to 
germ-free recipient mice. The direct delivery of propionate to the colon 
through propionate-esterified carbohydrate reduced weight gain in a 
randomized 24-week study of 60 overweight adults”. 

Short-chain fatty acids can signal to the host through at least four 
distinct pathways (Fig. 2). First, short-chain fatty acids, particularly
butyrate, are an energy substrate for colonocytes’’””, and in response 
to reduced energy availability, germ-free mice slow down the transit 
through the small intestine to allow more time for nutrient absorption”. 
Second, propionate is a substrate for gluconeogenesis and can induce 
intestinal gluconeogenesis, which signals through the central nervous 
system to protect the host from diet-induced obesity and associated 
glucose intolerance®. Third, butyrate and acetate, another short-chain 
fatty acid, can act as histone deacetylase inhibitors’*”*. (Acetate acts in 
peripheral tissues, in which the concentration of butyrate might not 
be high enough to exert an effect.) Fourth, short-chain fatty acids sig- 
nal through G-protein-coupled receptors such as GPR41 (also known 
as FFAR3) and GPR43 (also known as FFAR2), which affects several 
important processes that include inflammation” and enteroendocrine 
regulation’. However, the generation of short-chain fatty acids is only 
one aspect of microbial metabolism in the gut. 

The microbial metabolism of phosphatidylcholine”’, a phospholipid 
that is abundant in cheese, seafood, eggs and meat, and of L-carnitine”,
an amino acid that is abundant in red meat, produces high levels of TMA.
Once it has been absorbed from the gut into the bloodstream, TMA 
circulates to the liver and is enzymatically oxidized to TMA N-oxide 
(TMAO), a compound that has been associated with poor cardiovas- 
cular outcomes in humans and the acceleration of atherosclerosis in 
mice*”’*” (Fig. 2). TMA production serves as an excellent example 
of the interaction between the diet and the microbiota. For example, 
microbiotas that are capable of producing TMA make the metabolite 
only when compounds that contain trimethyl ammonium are present 
in the diet, and some microbiotas (such as those of vegans) are poor 
producers of TMA”, even when precursor compounds are transiently 
provided through the diet. Together, these data suggest that the micro- 
biota evolves to adapt to specific macronutrients. Many of the experi- 
ments that demonstrated the atherogenic nature of TMAO involved 
supplementing the low-fat diets of animals with the compound. Other 
metabolites probably contribute to metabolic disease — as supported 
by evidence from people who have undergone bariatric surgery, a pro- 
cedure that produces long-term weight loss and improved metabolism 
and reduces the risk of cardiovascular disease and death*”*' but that is 
associated with elevated levels of circulating TMAO”. The increased 
levels of TMAO in such patients might reflect the creation of a more 
aerobic gut environment that is conducive to generation of this metabo- 
lite. It is therefore essential to determine the conditions under which 
TMAO promotes cardiovascular disease and whether TMAO directly 
affects cardiovascular disease in humans. 

Bile acids, which the host synthesizes from cholesterol, are
another group of metabolites with a profound effect on human health”. 
They are metabolized by the microbiota in the lower part of the small 
intestine and the colon to generate secondary bile acids*’. They were 
originally thought only to act as soaps that solubilize dietary fats to pro- 
mote their absorption, but over the past two decades, it has become clear 
that they serve as signalling molecules and bind to distinct receptors 
such as G-protein-coupled bile acid receptor 1 (also known as TGR5) 
and the bile acid receptor FXR™ (Fig. 2). The microbiota regulates TGR5 
signalling by producing agonists” and FXR signalling by metaboliz- 
ing antagonists**. TGR5 and FXR both have a major impact on host 
metabolism* and, accordingly, an altered microbiota might affect host 
physiology by modulating the signals that pass through these receptors.
The capacity to metabolize tauro-β-muricholic acid, a naturally
occurring FXR antagonist**”’, is essential for the microbiota to induce 
obesity and steatosis, as well as impaired tolerance to glucose and insu- 
lin”. At least some of this effect is mediated by an altered microbiota”. 
Bariatric surgery is associated with an altered microbiota and metabo- 
lism of bile acids’*”’. Mechanistic links between bile acids and bariatric 
surgery demonstrate that functional FXR signalling is required for a 
reduction in body weight and an improvement in glucose tolerance 
after vertical sleeve gastrectomy”. Similarly, TGR5 is required for the 
improved metabolism of glucose following this procedure. Germ-free 
mice that received a faecal transplant from people who had undergone
Roux-en-Y gastric bypass 10 years earlier gained less fat than did mice 
that were colonized by microbiota from obese people’®. Some of the 
beneficial effects of bariatric surgery might therefore be mediated by the 
altered microbial metabolism of bile acids, which affects their capacity 
for signalling. Other mechanisms and metabolites might have equally 
important roles. 

The microbiota produces a vast number of metabolites and much 
work remains to be done to investigate fully their functions in physi- 
ology and pathophysiology. Examples of such metabolites include: 
ethylphenyl sulfate, which is connected to the exacerbation of autistic
behaviour in a mouse model”; indole propionic acid, which is linked
to improved function of the epithelial barrier in the gut”; and indoxyl
sulfate and p-cresyl sulfate, both of which are associated with poor cardiovascular
outcomes in people with uraemia (p-cresyl sulfate is also
associated with insulin resistance)” *’. These metabolites undoubtedly 
give a glimpse of how this poorly explored universe of molecules can 
affect the host. The relevance of some of these metabolites in humans 
is yet to be established. Although several bioactive metabolites are the 
derivatives of amino acids, neither the effect of the quantity and qual- 
ity of protein in the diet on metabolite synthesis nor the ensembles of 
microbial genes that are responsible for metabolite production are well 
understood. 


Inflammation and diet 

Obesity and insulin resistance are associated with the increased infiltra- 
tion of macrophages into and the inflammation of adipose tissue”. 
Because the gut microbiota is known to contribute to the obese phe- 
notype, at least in mice, it might also contribute to increased adipose 
inflammation. A model of adipose inflammation that is dependent on 
the microbiota but independent of diet is supported by evidence from 
Swiss Webster mice. While consuming a standard diet, these animals 
develop a similar amount of adiposity to C57Bl6 common laboratory mice 
that are fed a high-fat diet for 8 weeks. When germ-free Swiss Webster 
and C57Bl6 mice are fed their respective adiposity-inducing diets, both 
exhibit reduced adiposity, lower levels of endotoxins (known as lipopoly- 
saccharides) in the circulation and decreased macrophage infiltration 
into white adipose tissue, as well as improved metabolism of glucose*”®. 
Obesity in mice is also associated with increased numbers of T cells””’” 
and mast cells’® and reduced numbers of regulatory T cells’®”. In mouse 
models, the fermentation of fibre and the generation of short-chain fatty 
acids seem to promote anti-inflammatory responses both within the gut 
and systemically through regulatory T cells’** “°°. Although dietary fibre 
and the production of short-chain fatty acids exert a positive metabolic 
impact through non-immunological mechanisms in a mouse model of 
diet-induced obesity®, it is unclear whether similar interactions that are 
mediated through the immune compartment contribute to metabolic 
changes. The supplementation of high-fat diets with fermentable fibres 
protects mice from obesity and associated diseases'”” but the mechanism 
that underlies this action remains unclear. 

The gut microbiota also interacts with the innate immune system to 
induce adipose inflammation, and mice that lack Toll-like receptor signal- 
ling, through loss of either of the adaptor proteins MyD88 or TRIF (also 
known as TICAM1), have reduced levels of inflammation in adipose tissue
and are protected from insulin resistance that is induced by saturated
fatty acids“. Mice that are deficient in the gene Myd88, but not the gene 
Trif, are protected from diet-induced obesity, which therefore separates 
obesity from insulin resistance and suggests that they are controlled by 
different mechanisms. Mice raised in conventional conditions that are fed 
saturated fatty acids exhibit increased levels of endotoxins in the circula- 
tion in comparison to mice that consume polyunsaturated fatty acids. 
Dietary fat has been demonstrated to increase the amount of endotoxins 
in the blood plasma of both mice” and humans™, probably by allowing 
endotoxins to be transported across the epithelium on chylomicrons'”°. 
These higher levels of endotoxins activate Toll-like receptors in adipose 
tissue that, in turn, induce the expression of the chemokine CCL2, which 
is required for macrophage infiltration. The source of dietary fat might 




[Figure 3 panel labels. a: healthy person; disease state; biological samples; multi-omics-based analysis; data integration; hypothesis generation; approach validated in relevant model (in vitro, organoid, animal); elucidation of mechanisms; therapeutic approaches; clinical studies (drug trials, interventions). b: dietary interventions; responder; non-responder; multi-omics data; clinical metadata; personalized clinical profile; machine learning; predictive elements validated in independent cohort; mechanistic studies (in vitro, organoid, animal).]


Figure 3 | Strategies for modulating the gut microbiota to improve 
human health. a, The collection and comparison of multi-omics data 
from healthy people and those who are affected by metabolic disorders 
will implicate various genes, pathways and molecules as potential targets 
for intervention. Relevant experimental models (in vitro, organoid or 
animal models) are then used to elucidate underlying mechanisms and 
to pilot therapeutic approaches to modulating the gut microbiota, which 
lay the foundations for intervention studies or drug trials in humans. b, 
Studies in humans can also be a starting point for the identification of 


therefore have specific interactions with the microbiota that lead to altered 
interactions with the innate immune system and contribute to metabolic 
diseases. Mice fed a diet supplemented with fish oil are protected from 
obesity and insulin resistance. Furthermore, mice that consume lard and 
receive the microbiota of those fed fish oil are protected against obesity, 
which demonstrates that the modified microbiotas themselves have a 
protective effect. 

A switch to a diet rich in saturated fatty acids shifts the composition of 
the microbiota“. Levels of the bacterium Bilophila wadsworthia increase 
when mice are fed a diet rich in milk fat or supplemented with the bile acid 
taurocholic acid’. Similarly, increased levels of Bilophila and a reduced 
abundance of Desulfovibrio were observed in mice that were fed lard 
compared with fish oil as a source of fat“. B. wadsworthia increases gut 
inflammation in mice that lack the anti-inflammatory cytokine interleukin (IL)-10.
Insulin resistance that is induced through a high-fat diet is
associated with reduced levels of T helper 17 (TH17) cells that are positive
for IL-17 and retinoic acid receptor-related orphan receptor γt (RORγt)'”.
It is tempting to speculate that one of the underlying mechanisms involves 
the fat-induced restriction of a specific taxon known as the segmented 
filamentous bacteria, which induce the expression of IL-23 in enterocytes. 
IL-23 causes the release of IL-22 from innate lymphoid cells in the ileum, 
which subsequently induces the production of the proteins serum amyloid A1
and serum amyloid A2 from the epithelium in a paracrine fashion,
a process that is required for the activation of TH17 cells in the ileum’.
In mice, IL-22 has been shown to protect against metabolic disease, which
further suggests a link between the altered gut microbiota, TH17 cells and
IL-22 signalling and the mediation of metabolic disease. However, it is 
unknown whether taxa that induce specific immune responses, such as 
the segmented filamentous bacteria in mice, protect against metabolic 
disease. Despite efforts, there are no reports on the role of segmented fila- 
mentous bacteria in people, but other bacteria in the human microbiota 
might have developed similar functions. 


Dietary interventions and diet-based therapeutics 

The gut microbiota provides a powerful route to influencing human 
health. It has many attributes with biomedical potential, such as a 
connection to multiple facets of human biology, malleability and 
accessibility for therapeutic targeting or diagnostics. The microbes 
of the gut can therefore be likened to an easily accessible control 
centre for the modulation of human physiology. However, owing to 


[Figure 3b label: implementation in diet of patients]


strategies to modulate the gut microbiota through components of the diet, 
which are generally considered to be ‘safe’ interventions. Data-processing 
algorithms, such as machine learning, can be used to identify aspects of the 
clinical profile of individuals (including data on the microbiota) that help 
to predict the response of others to dietary interventions. After validation 
of these predictive elements in independent cohorts, the best intervention 
can be determined and then implemented to improve human health. 

Such predictive elements can also be used to guide mechanistic studies in 
experimental models. 


the complexity and individuality of each microbiota, the rate at which 
this potential can be realized is unknown. 

Diet and, in particular, polysaccharides serve as primary modula- 
tors of the composition and function of the microbiota. Polysaccha- 
rides, which are widely consumed components of human food, are 
therefore functionally analogous to small-molecule drugs. Because 
of their relative safety (that is, their lack of acute toxicity), availability 
and low cost, it might be feasible to determine systematically and
empirically which dietary polysaccharides, alone or in combinations,
can improve human health in different situations. 

Such an empirical approach is compatible with emerging concepts 
in precision health””'™. Although dietary interventions affect the
metabolic responses of hosts in an individualized manner, elements 
of the microbiome can help to predict the response. One study used 
continuous blood-glucose monitoring to follow postprandial gly- 
caemic responses in 800 people’. The responses of individuals to 
particular foods were highly variable. However, when compared 
with microbiome profiles and with measurements of metabolism 
and behaviour, using a machine-learning approach, the response 
of an individual to a given food could be predicted — even in an 
independent cohort. Similarly, individuals show large differences in 
glucose metabolism in response to an intervention that is based on 
whole grains”. Improved tolerance to glucose could be explained 
largely through enrichment of the genus Prevotella within the micro- 
biota. Prevotella could also improve the glucose metabolism of mice 
that were fed carbohydrate-rich diets but not a high-fat diet that was 
devoid of fermentable polysaccharides. These findings point to the 
possibility of a mechanism-free, empirical approach for determin- 
ing a dietary intervention that is appropriate for a given individual 
or group. They also highlight the potential of a next generation of 
probiotics (sets of microbiota-derived living microbes that will be 
tailored to interact with a given diet) as a method for converting 
non-responders into responders. A further outcome of this approach 
might be the use of predictive elements of metadata to guide the gen- 
eration of hypotheses and to determine priorities for investigation 
into underlying mechanisms. 
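
The train-then-validate logic behind such a mechanism-free approach can be sketched in a few lines. The Python below is a sketch under stated assumptions, not the method of the studies cited above: the cohorts are simulated, the single predictor (abundance of an unnamed taxon) and the linear relationship are invented, and ordinary least squares stands in for whatever machine-learning model a real study would use. It fits the model on one cohort and checks how well its predictions track observed responses in an independent one.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_cohort(n, rng):
    """Synthetic (taxon abundance, glycaemic response) pairs; the relationship is made up."""
    cohort = []
    for _ in range(n):
        abundance = rng.uniform(0.0, 1.0)
        response = 60.0 - 30.0 * abundance + rng.gauss(0.0, 5.0)
        cohort.append((abundance, response))
    return cohort

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
train = simulate_cohort(200, rng)      # discovery cohort
validation = simulate_cohort(100, rng) # independent cohort, never used for fitting

intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
predicted = [intercept + slope * x for x, _ in validation]
observed = [y for _, y in validation]
r = pearson(predicted, observed)
print(f"slope={slope:.1f}, validation r={r:.2f}")
```

The point of the design is the separation of the two cohorts: a predictor counts as useful only if the correlation between predicted and observed responses holds up in people who played no part in fitting the model.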


Perspective 
It is becoming clear that an altered gut microbiota is associated with
metabolic diseases in humans that range from obesity to type 2 diabetes
and cardiovascular disease. Causality has also been demonstrated in 
animal models. To move forwards, it will be essential to understand 
whether the gut microbiota is causally linked to host metabolism in 
humans. Prospective studies should be performed to determine whether 
the gut microbiota is altered before or after the onset of disease. This will 
require large cohorts that allow considerable numbers of participants 
to develop the disease under investigation, and it will probably involve 
the high-resolution monitoring of host and microbial parameters to 
determine the progression of derangements. 

Another approach is to transfer microbiotas from humans to mice, 
and this is particularly powerful when focused on twin cohorts to control
for human genetics'*’**. In one such study, transplantation of the
microbiota from obese individuals to germ-free mice transfers the obese 
phenotype, as determined by increased weight gain, whereas adminis- 
tration of Christensenella minuta prevents weight gain’>. In a separate 
study, bacterial representatives from the microbiota of lean individu- 
als were associated with an increased production of short-chain fatty 
acids, whereas the microbiota of obese individuals had an increased 
abundance of genes that are involved in biosynthesis of branched-chain 
amino acids, which are associated with impaired sensitivity to insu- 
lin“. Importantly, the lean microbiota could only invade and prevent 
increased adiposity when the recipient mice consumed a diet that was 
low in fat and high in fruits and vegetables. Consistent with the idea 
that the microbiota reinforces the diet, supplementation with Prevotella 
produces an improved tolerance to glucose only when mice are fed a 
standard diet that is rich in fibre, and not a Western-style diet, which 
is devoid of fibre”. 

A similar dependency on diet was observed in children with a type 
of malnutrition known as kwashiorkor’». Twins that are discordant for 
kwashiorkor have distinct microbiotas, and germ-free mice that have 
been colonized with a ‘kwashiorkor’ microbiota experience weight loss 
when they are fed a typical Malawian diet, which is based on tomatoes 
and corn. However, when the mice are fed a peanut-based, ‘ready-to-use’ 
therapeutic food, their weight transiently increases and their microbiota 
normalize’. It is becoming increasingly important to consider how the 
diet can modify microbiota-linked disease states in mice to generate 
hypotheses about underlying molecular mechanisms that can then be 
tested and validated in people. Faecal microbiota transplantation, which 
has been shown to cure recurrent infection with Clostridium difficile’, 
has also been used to directly address whether the gut microbiota can 
affect the metabolism of the host. Eighteen insulin-resistant obese men 
were randomly designated to receive either an autologous (control) fae- 
cal microbiota transplant or a similar transplant from a lean, insulin- 
sensitive donor. Insulin clamps that were performed before and after 
the intervention revealed that the insulin sensitivity of a subset of the 
participants had significantly improved 6 weeks after the transplant (ref. 117).
It is unclear whether the positive responses of these individuals depend
on characteristics of the donors or of the recipients, and how long the
responses persist. Research in larger cohorts is required to verify the
effects of faecal microbiota transplantation and to answer the remaining
questions. For example, experiments could be
performed using specific bacteria from lean microbiotas with the aim 
of developing next-generation probiotics. It is clear that stratification 
might be required to identify groups that are likely to respond to such 
interventions.

To improve the understanding of how the microbiota affects the 
metabolism in humans, metagenomics, transcriptomics, proteom- 
ics and metabolomics data from key target tissues and the microbiota 
during various disease states and interventions should be combined to 
provide a map of co-occurrences. These data enable the formation of 
testable hypotheses that can be pursued in validated animal models, 
and they will form the foundation for precision interventions (Fig. 3). 

It will also be important to gain a more nuanced understanding of the 
foundational principles of the microbiota, such as the cross-sectional or 
longitudinal spatial organization of interactions between the host and its 
microbes in the intestine (ref. 118). The majority of studies in humans and mice
rely on faecal samples, which provide some representation of what is
occurring throughout the digestive tract; however, aspects of microbial 
communities and host responses that are specific to the small intestine 
might be obscured by faecal sampling. For example, such sampling could miss
information on how the microbiota affects nutrient absorption in the 
small intestine through its impact on glucose transporters and bile acids, 
which are essential for the absorption of lipids and fat-soluble vitamins. 

Microbial metabolites probably act as mediators for the host metabo- 
lism and can be either beneficial (for example, butyrate) or detrimental 
(TMAO). Such molecules might therefore provide fresh therapeutic 
approaches in which beneficial metabolites could be supplemented 
pharmacologically or the bacteria that produce them are developed into 
probiotics. And receptor antagonists could be developed from detri- 
mental metabolites if the relevant receptor has been identified. Another 
possibility is to target the microbial enzymes that produce metabolites 
with inhibitors. An inhibitor of TMA lyase that stops the microbial 
synthesis of TMA and therefore reduces the levels of circulating TMAO 
prevents the development of atherosclerosis in mice (ref. 121). However, such
inhibitors are yet to be tested in humans, and it is unlikely that one 
metabolite acts alone to promote or prevent metabolic diseases. Strate- 
gies that promote or prevent suites of metabolites are more likely to have 
wider applicability and larger effects on host metabolism. 

It is reasonable to consider what proportion of metabolic problems in 
humans could be addressed by properly caring for the gut microbiota. 
The use of antibiotics in early life is associated with obesity in both 
people and mice, which suggests that the disruption of microbial eco- 
systems at crucial points in time might affect physiology in later life and 
also that the amendment of medical practices could have a substantial 
impact (refs 66, 122). However, changes in the diet might be more important for
reaping the health benefits that the microbiota can provide. Increased 
levels of polysaccharides are likely to be of benefit to people who fol- 
low a typical Western-style diet, most of whom consume far below the 
recommended amounts of dietary fibre (ref. 48); meta-analyses show that
the increased consumption of fibre significantly decreases the risk of 
mortality (refs 123, 124). Controlled dietary interventions that document the
utility of various supplements, probiotics, nutrients and foods in modulating
aspects of the gut microbiota and human health are required.
The measurement of multiple aspects of individuality, including the 
microbiota, will provide insight into the characteristics of people who 
respond beneficially to a given intervention and will pave the way for 
microbiota-focused precision nutrition. A deeper understanding of the 
gut microbiota, an important aspect of failing health, has the potential 
to contribute big gains in our understanding of metabolic health and 
weight loss.


Received 27 January; accepted 22 April 2016. 


1. Mooradian, A. D. Dyslipidemia in type 2 diabetes mellitus. Nature Clin. Pract. 
Endocrinol. Metab. 5, 150-159 (2009). 

2. Eckburg, P. B. et al. Diversity of the human intestinal microbial flora. Science
308, 1635-1638 (2005). 

3. The Human Microbiome Project Consortium. Structure, function and diversity 
of the healthy human microbiome. Nature 486, 207-214 (2012). 

4. Ley, R. E., Turnbaugh, P. J., Klein, S. & Gordon, J. I. Microbial ecology: human gut
microbes associated with obesity. Nature 444, 1022-1023 (2006).

5. Qin, J. et al. A human gut microbial gene catalogue established by
metagenomic sequencing. Nature 464, 59-65 (2010).

6. Li, J. et al. An integrated catalog of reference genes in the human gut
microbiome. Nature Biotechnol. 32, 834-841 (2014).

7. Bäckhed, F. et al. The gut microbiota as an environmental factor that regulates
fat storage. Proc. Natl Acad. Sci. USA 101, 15718-15723 (2004).

8. Bäckhed, F., Manchester, J. K., Semenkovich, C. F. & Gordon, J. I. Mechanisms
underlying the resistance to diet-induced obesity in germ-free mice. Proc. Natl
Acad. Sci. USA 104, 979-984 (2007).

9. Rabot, S. et al. Germ-free C57BL/6J mice are resistant to high-fat-diet-induced
insulin resistance and have altered cholesterol metabolism. FASEB J. 24,
4948-4959 (2010).

10. Ding, S. et al. High-fat diet: bacteria interactions promote intestinal
inflammation which precedes and correlates with obesity and insulin resistance
in mouse. PLoS ONE 5, e12191 (2010).

11. Ley, R. E. et al. Obesity alters gut microbial ecology. Proc. Natl Acad. Sci. USA
102, 11070-11075 (2005).
12. Turnbaugh, P. J. et al. An obesity-associated gut microbiome with increased
capacity for energy harvest. Nature 444, 1027-1031 (2006).
The first study to show that the microbiota from an obese mouse could confer
increased weight gain to a germ-free recipient mouse.

13. Le Chatelier, E. et al. Richness of human gut microbiome correlates with
metabolic markers. Nature 500, 541-546 (2013).

14. Ridaura, V. K. et al. Gut microbiota from twins discordant for obesity modulate
metabolism in mice. Science 341, 1241214 (2013).
This study showed that a microbiota from a lean individual could invade the
microbiota of an obese individual and provide protection from weight gain,
but that the invasion and protection was dependent on diet.

15. Goodrich, J. K. et al. Human genetics shape the gut microbiome. Cell 159,
789-799 (2014).

16. Tremaroli, V. et al. Roux-en-Y gastric bypass and vertical banded gastroplasty
induce long-term changes on the human gut microbiome contributing to fat
mass regulation. Cell Metab. 22, 228-238 (2015).

17. Turnbaugh, P. J. et al. A core gut microbiome in obese and lean twins. Nature
457, 480-484 (2009).

18. David, L. A. et al. Diet rapidly and reproducibly alters the human gut
microbiome. Nature 505, 559-563 (2014).

19. De Filippo, C. et al. Impact of diet in shaping gut microbiota revealed by a
comparative study in children from Europe and rural Africa. Proc. Natl Acad. Sci.
USA 107, 14691-14696 (2010).
The first of several studies to show that the gut microbiota of a traditional
rural population is more diverse than and contains distinct taxa in comparison
to the microbiotas of Western populations.

20. Wu, G. D. et al. Linking long-term dietary patterns with gut microbial
enterotypes. Science 334, 105-108 (2011).

21. Cotillard, A. et al. Dietary intervention impact on gut microbial gene richness.
Nature 500, 585-588 (2013).

22. Kovatcheva-Datchary, P. et al. Dietary fiber-induced improvement in glucose
metabolism is associated with increased abundance of Prevotella. Cell Metab.
22, 971-982 (2015).

23. Walker, A. W. et al. Dominant and diet-responsive groups of bacteria within the
human colonic microbiota. ISME J. 5, 220-230 (2011).

24. Ley, R. E. et al. Evolution of mammals and their gut microbes. Science 320,
1647-1651 (2008).

25. Muegge, B. D. et al. Diet drives convergence in gut microbiome functions across
mammalian phylogeny and within humans. Science 332, 970-974 (2011).

26. Torrey, J. C. The regulation of the intestinal flora of dogs through diet. J. Med.
Res. 39, 415-447 (1919).

27. Koeth, R. A. et al. Intestinal microbiota metabolism of L-carnitine, a nutrient in
red meat, promotes atherosclerosis. Nature Med. 19, 576-585 (2013).

28. Wu, G. D. et al. Comparative metabolomics in vegans and omnivores reveal
constraints on diet-dependent gut microbiota metabolite production. Gut 65,
63-72 (2016).

29. Hehemann, J. H. et al. Transfer of carbohydrate-active enzymes from marine
bacteria to Japanese gut microbiota. Nature 464, 908-912 (2010).

30. Wexler, A. G. et al. Human symbionts inject and neutralize antibacterial toxins to
persist in the gut. Proc. Natl Acad. Sci. USA 113, 3639-3644 (2016).

31. Chatzidaki-Livanis, M., Geva-Zatorsky, N. & Comstock, L. E. Bacteroides
fragilis type VI secretion systems use novel effector and immunity proteins to
antagonize human gut Bacteroidales species. Proc. Natl Acad. Sci. USA 113,
3627-3632 (2016).

32. Lee, S. M. et al. Bacterial colonization factors control specificity and stability of
the gut microbiota. Nature 501, 426-429 (2013).

33. Reyes, A., Wu, M., McNulty, N. P., Rohwer, F. L. & Gordon, J. I. Gnotobiotic mouse
model of phage-bacterial host dynamics in the human gut. Proc. Natl Acad. Sci.
USA 110, 20236-20241 (2013).

34. Wostmann, B. S., Larkin, C., Moriarty, A. & Bruckner-Kardoss, E. Dietary intake,
energy metabolism, and excretory losses of adult male germfree Wistar rats.
Lab. Anim. Sci. 33, 46-50 (1983).

35. Lozupone, C. A. et al. The convergence of carbohydrate active gene repertoires
in human gut microbes. Proc. Natl Acad. Sci. USA 105, 15076-15081 (2008).

36. El Kaoutari, A., Armougom, F., Gordon, J. I., Raoult, D. & Henrissat, B. The
abundance and variety of carbohydrate-active enzymes in the human gut
microbiota. Nature Rev. Microbiol. 11, 497-504 (2013).

37. McNeil, N. I. The contribution of the large intestine to energy supplies in man.
Am. J. Clin. Nutr. 39, 338-342 (1984).

38. Bergman, E. N. Energy contributions of volatile fatty acids from the
gastrointestinal tract in various species. Physiol. Rev. 70, 567-590 (1990).

39. Bingham, S. & Cummings, J. H. in Medical Aspects of Dietary Fiber (eds Spiller, G.
A. & Kay, R. M.) 261-284 (Plenum, 1980).

40. Schnorr, S. L. et al. Gut microbiome of the Hadza hunter-gatherers. Nature
Commun. 5, 3654 (2014).

41. Sommer, F. et al. The gut microbiota modulates energy metabolism in the
hibernating brown bear Ursus arctos. Cell Rep. 14, 1655-1661 (2016).

42. Koren, O. et al. Host remodeling of the gut microbiome and metabolic changes
during pregnancy. Cell 150, 470-480 (2012).

43. Turnbaugh, P. J., Bäckhed, F., Fulton, L. & Gordon, J. I. Diet-induced obesity is
linked to marked but reversible alterations in the mouse distal gut microbiome.
Cell Host Microbe 3, 213-223 (2008).

44. Caesar, R., Tremaroli, V., Kovatcheva-Datchary, P., Cani, P. D. & Bäckhed, F.
Crosstalk between gut microbiota and dietary lipids aggravates WAT
inflammation through TLR signaling. Cell Metab. 22, 658-668 (2015).

45. Fleissner, C. K. et al. Absence of intestinal microbiota does not protect mice
from diet-induced obesity. Br. J. Nutr. 104, 919-929 (2010).


46. Chassaing, B. et al. Dietary emulsifiers impact the mouse gut microbiota
promoting colitis and metabolic syndrome. Nature 519, 92-96 (2015).

47. Suez, J. et al. Artificial sweeteners induce glucose intolerance by altering the gut
microbiota. Nature 514, 181-186 (2014).

48. McGill, C. R., Fulgoni, V. L. III & Devareddy, L. Ten-year trends in fiber and whole
grain intakes and food sources for the United States population: National
Health and Nutrition Examination Survey 2001-2010. Nutrients 7, 1119-1130
(2015).

49. Costello, E. K., Stagaman, K., Dethlefsen, L., Bohannan, B. J. & Relman, D. A.
The application of ecological theory toward an understanding of the human
microbiome. Science 336, 1255-1262 (2012).

50. Cardinale, B. J. et al. Biodiversity loss and its impact on humanity. Nature 486,
59-67 (2012).

51. Turnbaugh, P. J. et al. A core gut microbiome in obese and lean twins. Nature
457, 480-484 (2009).

52. Yatsunenko, T. et al. Human gut microbiome viewed across age and geography.
Nature 486, 222-227 (2012).

53. Obregon-Tito, A. J. et al. Subsistence strategies in traditional societies
distinguish gut microbiomes. Nature Commun. 6, 6505 (2015).

54. Martinez, I. et al. The gut microbiota of rural Papua New Guineans: composition,
diversity patterns, and ecological processes. Cell Rep. 11, 527-538 (2015).

55. Clemente, J. C. et al. The microbiome of uncontacted Amerindians. Sci. Adv. 1,
e1500183 (2015).

56. Forslund, K. et al. Country-specific antibiotic use practices impact the human
gut resistome. Genome Res. 23, 1163-1169 (2013).

57. Karlsson, F. H. et al. Gut metagenome in European women with normal,
impaired and diabetic glucose control. Nature 498, 99-103 (2013).

58. Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2
diabetes. Nature 490, 55-60 (2012).

59. Srinivasan, S. et al. Bacterial communities in women with bacterial vaginosis:
high resolution phylogenetic analyses reveal relationships of microbiota to
clinical criteria. PLoS ONE 7, e37818 (2012).

60. Martens, E. C., Kelly, A. G., Tauzin, A. S. & Brumer, H. The devil lies in the details:
how variations in polysaccharide fine-structure impact the physiology and
evolution of gut microbes. J. Mol. Biol. 426, 3851-3865 (2014).

61. Sonnenburg, E. D. & Sonnenburg, J. L. Starving our microbial self: the
deleterious consequences of a diet deficient in microbiota-accessible
carbohydrates. Cell Metab. 20, 779-786 (2014).

62. Rampelli, S. et al. Metagenome sequencing of the Hadza hunter-gatherer gut
microbiota. Curr. Biol. 25, 1682-1693 (2015).

63. Sonnenburg, E. D. et al. Diet-induced extinctions in the gut microbiota
compound over generations. Nature 529, 212-215 (2016).

64. Donia, M. S. & Fischbach, M. A. Small molecules from the human microbiota.
Science 349, 1254766 (2015).

65. Meyer, T. W. & Hostetter, T. H. Uremic solutes from colon microbes. Kidney Int.
81, 949-954 (2012).

66. Cho, I. et al. Antibiotics in early life alter the murine colonic microbiome and
adiposity. Nature 488, 621-626 (2012).
This study demonstrated that the use of antibiotics in early life might cause
metabolic disease in later life.

67. Schwiertz, A. et al. Microbiota and SCFA in lean and overweight healthy
subjects. Obesity 18, 190-195 (2010).

68. De Vadder, F. et al. Microbiota-generated metabolites promote metabolic
benefits via gut-brain neural circuits. Cell 156, 84-96 (2014).

69. Liou, A. P. et al. Conserved shifts in the gut microbiota due to gastric bypass
reduce host weight and adiposity. Sci. Transl. Med. 5, 178ra41 (2013).

70. Chambers, E. S. et al. Effects of targeted delivery of propionate to the human
colon on appetite regulation, body weight maintenance and adiposity in
overweight adults. Gut 64, 1744-1754 (2015).

71. Donohoe, D. R. et al. The microbiome and butyrate regulate energy metabolism
and autophagy in the mammalian colon. Cell Metab. 13, 517-526 (2011).

72. Donohoe, D. R., Wali, A., Brylawski, B. P. & Bultman, S. J. Microbial regulation
of glucose metabolism and cell-cycle progression in mammalian colonocytes.
PLoS ONE 7, e46589 (2012).

73. Wichmann, A. et al. Microbial modulation of energy availability in the colon
regulates intestinal transit. Cell Host Microbe 14, 582-590 (2013).

74. Thorburn, A. N. et al. Evidence that asthma is a developmental origin disease
influenced by maternal diet and bacterial metabolites. Nature Commun. 6,
7320 (2015).

75. Davie, J. R. Inhibition of histone deacetylase activity by butyrate. J. Nutr. 133,
2485S-2493S (2003).

76. Maslowski, K. M. et al. Regulation of inflammatory responses by gut microbiota
and chemoattractant receptor GPR43. Nature 461, 1282-1286 (2009).

77. Samuel, B. S. et al. Effects of the gut microbiota on host adiposity are
modulated by the short-chain fatty-acid binding G protein-coupled receptor,
Gpr41. Proc. Natl Acad. Sci. USA 105, 16767-16772 (2008).

78. Wang, Z. et al. Gut flora metabolism of phosphatidylcholine promotes
cardiovascular disease. Nature 472, 57-63 (2011).

79. Tang, W. H. et al. Intestinal microbial metabolism of phosphatidylcholine and
cardiovascular risk. N. Engl. J. Med. 368, 1575-1584 (2013).

80. Sjöström, L. et al. Effects of bariatric surgery on mortality in Swedish obese
subjects. N. Engl. J. Med. 357, 741-752 (2007).

81. Sjöström, L. et al. Bariatric surgery and long-term cardiovascular events. J. Am.
Med. Assoc. 307, 56-65 (2012).

82. Russell, D. W. The enzymes, regulation, and genetics of bile acid synthesis.
Annu. Rev. Biochem. 72, 137-174 (2003).




83. Midtvedt, T. Microbial bile acid transformation. Am. J. Clin. Nutr. 27, 1341-1347 
(1974). 

84. Thomas, C., Pellicciari, R., Pruzanski, M., Auwerx, J. & Schoonjans, K. Targeting 
bile-acid signalling for metabolic diseases. Nature Rev. Drug Discov. 7, 678-693 
(2008). 

85. Kawamata, Y. et al. A G protein-coupled receptor responsive to bile acids. J. Biol. 
Chem. 278, 9435-9440 (2003). 

86. Sayin, S. I. et al. Gut microbiota regulates bile acid metabolism by reducing the
levels of tauro-beta-muricholic acid, a naturally occurring FXR antagonist. Cell
Metab. 17, 225-235 (2013).

87. Li, F. et al. Microbiome remodelling leads to inhibition of intestinal farnesoid X
receptor signalling and decreased obesity. Nature Commun. 4, 2384 (2013).

88. Jiang, C. et al. Intestinal farnesoid X receptor signaling promotes nonalcoholic
fatty liver disease. J. Clin. Invest. 125, 386-402 (2015).

89. Parséus, A. et al. Microbiota-induced obesity requires farnesoid X receptor. Gut
http://dx.doi.org/10.1136/gutjnl-2015-310283 (2016).

90. Ryan, K. K. et al. FXR is a molecular target for the effects of vertical sleeve
gastrectomy. Nature 509, 183-188 (2014).

91. Hsiao, E. Y. et al. Microbiota modulate behavioral and physiological
abnormalities associated with neurodevelopmental disorders. Cell 155,
1451-1463 (2013).

92. Venkatesh, M. et al. Symbiotic bacterial metabolites regulate gastrointestinal 
barrier function via the xenobiotic sensor PXR and Toll-like receptor 4. Immunity 
41, 296-310 (2014). 

93. Meijers, B. K. et al. p-Cresol and cardiovascular risk in mild-to-moderate kidney 
disease. Clin. J. Am. Soc. Nephrol. 5, 1182-1189 (2010). 

94. Koppe, L. et al. p-Cresyl sulfate promotes insulin resistance associated with 
CKD. J. Am. Soc. Nephrol. 24, 88-99 (2013). 

95. Barreto, F. C. et al. Serum indoxyl sulfate is associated with vascular disease
and mortality in chronic kidney disease patients. Clin. J. Am. Soc. Nephrol. 4, 
1551-1558 (2009). 

96. Weisberg, S. P. et al. Obesity is associated with macrophage accumulation in 
adipose tissue. J. Clin. Invest. 112, 1796-1808 (2003). 

97. Xu, H. et al. Chronic inflammation in fat plays a crucial role in the development 
of obesity-related insulin resistance. J. Clin. Invest. 112, 1821-1830 (2003). 

98. Caesar, R. et al. Gut-derived lipopolysaccharide augments adipose macrophage
accumulation but is not essential for impaired glucose or insulin tolerance in 
mice. Gut 61, 1701-1707 (2012). 

99. Winer, S. et al. Normalization of obesity-associated insulin resistance through 
immunotherapy. Nature Med. 15, 921-929 (2009). 

100. Nishimura, S. et al. CD8+ effector T cells contribute to macrophage recruitment
and adipose tissue inflammation in obesity. Nature Med. 15, 914-920 (2009).

101. Liu, J. et al. Genetic deficiency and pharmacological stabilization of mast cells
reduce diet-induced obesity and diabetes in mice. Nature Med. 15, 940-945
(2009).

102. Feuerer, M. et al. Lean, but not obese, fat is enriched for a unique population of
regulatory T cells that affect metabolic parameters. Nature Med. 15, 930-939
(2009).

103. Smith, P. M. et al. The microbial metabolites, short-chain fatty acids, regulate
colonic Treg cell homeostasis. Science 341, 569-573 (2013).

104. Furusawa, Y. et al. Commensal microbe-derived butyrate induces the
differentiation of colonic regulatory T cells. Nature 504, 446-450 (2013).

105. Trompette, A. et al. Gut microbiota metabolism of dietary fiber influences
allergic airway disease and hematopoiesis. Nature Med. 20, 159-166 (2014).

106. Arpaia, N. et al. Metabolites produced by commensal bacteria promote
peripheral regulatory T-cell generation. Nature 504, 451-455 (2013).

107. Cani, P. D. et al. Selective increases of bifidobacteria in gut microflora improve
high-fat-diet-induced diabetes in mice through a mechanism associated with
endotoxaemia. Diabetologia 50, 2374-2383 (2007).

108. Cani, P. D. et al. Metabolic endotoxemia initiates obesity and insulin resistance.
Diabetes 56, 1761-1772 (2007).
The first study to demonstrate that the presence of endotoxin is sufficient to
alter glucose metabolism in mice.

109. Erridge, C., Attina, T., Spickett, C. M. & Webb, D. J. A high-fat meal induces
low-grade endotoxemia: evidence of a novel mechanism of postprandial
inflammation. Am. J. Clin. Nutr. 86, 1286-1292 (2007).

110. Ghoshal, S., Witta, J., Zhong, J., de Villiers, W. & Eckhardt, E. Chylomicrons
promote intestinal absorption of lipopolysaccharides. J. Lipid Res. 50, 90-97
(2009).


64 | NATURE | VOL 535 | 7 JULY 2016 


111. Devkota, S. et al. Dietary-fat-induced taurocholic acid promotes pathobiont
expansion and colitis in Il10−/− mice. Nature 487, 104-108 (2012).

112. Garidou, L. et al. The gut microbiota regulates intestinal CD4 T cells expressing
RORγt and controls metabolic disease. Cell Metab. 22, 100-112 (2015).

113. Sano, T. et al. An IL-23R/IL-22 circuit regulates epithelial serum amyloid A to
promote local effector Th17 responses. Cell 163, 381-393 (2015).

114. Zeevi, D. et al. Personalized nutrition by prediction of glycemic responses. Cell
163, 1079-1094 (2015).
This study used a machine-learning approach to mine personal health
profiles that included microbiome data to predict the postprandial glycaemic
response.

115. Smith, M. I. et al. Gut microbiomes of Malawian twin pairs discordant for
kwashiorkor. Science 339, 548-554 (2013).

116. van Nood, E. et al. Duodenal infusion of donor feces for recurrent Clostridium
difficile. N. Engl. J. Med. 368, 407-415 (2013).

117. Vrieze, A. et al. Transfer of intestinal microbiota from lean donors increases
insulin sensitivity in individuals with metabolic syndrome. Gastroenterology
143, 913-916 (2012).
This study demonstrated that sensitivity to insulin could be changed by
directly altering the gut microbiota through faecal microbiota transplantation.

118. Earle, K. A. et al. Quantitative imaging of gut microbiota spatial organization.
Cell Host Microbe 18, 478-488 (2015).

119. Lichtman, J. S. et al. The effect of microbial colonization on the host proteome
varies by gastrointestinal location. ISME J. 10, 1170-1181 (2016).

120. Turnbaugh, P. J. et al. The effect of diet on the human gut microbiome: a
metagenomic analysis in humanized gnotobiotic mice. Sci. Transl. Med. 1,
6ra14 (2009).

121. Wang, Z. et al. Non-lethal inhibition of gut microbial trimethylamine production
for the treatment of atherosclerosis. Cell 163, 1585-1595 (2015).
The first example of inhibiting microbial enzymes (or ‘drugging the bug’) to
prevent atherosclerosis.

122. Ajslev, T. A., Andersen, C. S., Gamborg, M., Sørensen, T. I. & Jess, T. Childhood
overweight after establishment of the gut microbiota: the role of delivery mode,
pre-pregnancy weight and early administration of antibiotics. Int. J. Obes. 35,
522-529 (2011).

123. Kim, Y. & Je, Y. Dietary fiber intake and total mortality: a meta-analysis of
prospective cohort studies. Am. J. Epidemiol. 180, 565-573 (2014).

124. Yang, Y., Zhao, L. G., Wu, Q. J., Ma, X. & Xiang, Y. B. Association between dietary
fiber and lower risk of all-cause mortality: a meta-analysis of cohort studies.
Am. J. Epidemiol. 181, 83-91 (2015).

125. Salonen, A. et al. Impact of diet and individual variation on intestinal microbiota
composition and fermentation products in obese men. ISME J. 8, 2218-2230
(2014).

126. Walker, A. W., Duncan, S. H., McWilliam Leitch, E. C., Child, M. W. & Flint, H. J. pH
and peptide supply can radically alter bacterial populations and short-chain
fatty acid ratios within microbial communities from the human colon. Appl.
Environ. Microbiol. 71, 3692-3700 (2005).

127. Chung, W. S. et al. Modulation of the human gut microbiota by dietary fibres
occurs at the species level. BMC Biol. 14, 3 (2016).

128. Bown, R. L., Gibson, J. A., Sladen, G. E., Hicks, B. & Dawson, A. M. Effects
of lactulose and other laxatives on ileal and colonic pH as measured by a
radiotelemetry device. Gut 15, 999-1004 (1974).

129. Kettle, H., Louis, P., Holtrop, G., Duncan, S. H. & Flint, H. J. Modelling the
emergent dynamics and major metabolites of the human colonic microbiota.
Environ. Microbiol. 17, 1615-1630 (2015).


Acknowledgements The authors thank members of the Sonnenburg and Bäckhed
laboratories for discussions. This work was funded by a grant from the US National
Institute of Diabetes and Digestive and Kidney Diseases (NIDDK; R01-DK085025
to J.L.S.) and grants from the Swedish Research Council and the Novo Nordisk 
Foundation to F.B. F.B. is a recipient of a European Research Council Consolidator 
Grant (615362-METABASE). 


Author information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of this paper 
at go.nature.com/28j4ikq. Correspondence should be addressed to J.L.S. 
(jsonnenburg@stanford.edu) and F.B. (fredrik.backhed@wlab.gu.se). 


© 2016 Macmillan Publishers Limited. All rights reserved. 


REVIEW 


doi:10.1038/nature18847 


The microbiome and innate immunity


Christoph A. Thaiss1*, Niv Zmora1,2,3*, Maayan Levy1* & Eran Elinav1


The intestinal microbiome is a signalling hub that integrates environmental inputs, such as diet, with genetic and immune 
signals to affect the host’s metabolism, immunity and response to infection. The haematopoietic and non-haematopoietic 
cells of the innate immune system are located strategically at the host-microbiome interface. These cells have the ability 
to sense microorganisms or their metabolic products and to translate the signals into host physiological responses and 
the regulation of microbial ecology. Aberrations in the communication between the innate immune system and the gut
microbiota might contribute to complex diseases.


The past two decades witnessed a revolution in our understanding
of host-microbial interactions that led to the concept of the
mammalian holobiont, the result of co-evolution of the eukaryotic
and prokaryotic parts of an organism. The revolution required two
paradigm shifts that had a tremendous impact on their respective fields. 
The first occurred during the late 1990s with the discovery of pattern- 
recognition receptors (PRRs) in the innate immune system that sense 
microorganisms through conserved molecular structures. Several fami- 
lies of PRRs and their signalling pathways are now known, including 
the Toll-like receptors (TLRs), the nucleotide-binding oligomerization 
(NOD)-like receptors (NLRs), the RIG-I-like receptors, the C-type lec- 
tin receptors, the absent in melanoma 2 (AIM2)-like receptors and the 
OAS-like receptors (ref. 1). These sensors are expressed by a variety of cellular
compartments and constitute a continuous surveillance system for the 
presence of microorganisms in tissues. 

The second shift occurred fewer than 10 years later and was driven 
by the culture-independent characterization of the microbiome (ref. 2):
the entirety of the microorganisms that colonize the human body and 
their genomes. Because of the enormous number of microorganisms 
that reside on the surface of the body — the skin and the gastrointes- 
tinal, respiratory and urogenital tracts — it seemed improbable that 
innate immune recognition of microorganisms could be coupled to 
the immediate initiation of immune responses against them without 
leading to overt, organism-wide inflammation and its damaging effects. 
It was therefore hypothesized that microbial sensing at the body sur- 
face needs to be tightly controlled to ensure a symbiotic relationship 
between the host and its indigenous commensal microorganisms (ref. 3), while
allowing for the initiation of a rapid, sterilizing immune response on 
penetration of microorganisms into non-colonized sites. This idea was 
developed further after the realization that host-microbiota mutualism
is lost in the absence of innate immune recognition of commensal
microorganisms, with detrimental consequences for health (refs 4, 5). The
crosstalk between innate immunity and the microbiome is now known 
to extend far beyond the achievement of a careful balance between toler- 
ance to commensal microorganisms and immunity to pathogens. The 
microbiota integrates into whole-organism physiology and influences 
multiple facets of organismal homeostasis through its effects on the 
innate immune system. Sensing by this system therefore serves as a 
rheostat for the metabolic activity of the microbiota and its exposure to 
diet and xenobiotics, as well as for the presence of mucosal infections. 
The information that is gathered is then processed at various levels of
physiology to dynamically adjust the activity of the host to fit the state of
the surrounding microbial ecosystem. Conversely, the innate immune 
system plays an important part in shaping the community and ecology 
of indigenous microorganisms into configurations that can be tolerated 
by the host and are beneficial for its metabolic activities. This complex, 
bilateral interaction between the host and its microbiota has a crucial 
role in human health. Many ‘multifactorial’ disorders, formerly con- 
sidered to be idiopathic, might therefore be influenced or even driven 
by alteration of the intimate crosstalk that occurs between the innate 
immune system and the microbiota during homeostasis. In this Review, 
we highlight paradigms of interactions between the innate immune 
system and the microbiota, the mechanisms that are involved in this 
crosstalk and how aberrations in either of the partners of this com- 
munication network contribute to the molecular aetiology of common 
multifactorial disorders. Because the roles of viruses, fungi and parasites 
have been summarized elsewhere, we focus on the interplay between
the innate immune system and the bacterial microbiome. 


Physiological functions 

A network of interactions characterizes the interdependence between 
the innate immune system and the microbiota. The two systems affect
one another to orchestrate whole-organism physiology. 


Epithelial cells 
Although not classically considered to be bona fide cells of the innate 
immune system, intestinal epithelial cells are equipped with an extensive 
repertoire of innate immune receptors (Fig. 1). Expression of these
receptors and active signal transduction on microbial recognition is 
pivotal for intestinal homeostasis because their epithelial-specific dele- 
tion leads to breaches in the epithelial barrier, which compromises the 
spatial separation between commensal bacteria and the lamina pro- 
pria of the intestines, thereby predisposing the tissue to spontaneous 
inflammation. This has been demonstrated for components that are 
involved in TLR signalling, including myeloid differentiation primary 
response protein MyD88, TNF receptor-associated factor 6 (TRAF6), 
and NF-κB essential modulator (NEMO), as well as for orchestrators
of cell death such as receptor-interacting serine/threonine-protein
kinase 1 (RIPK1), FAS-associated death domain protein (FADD) and 
caspase-8 (refs 13-16). 

¹Department of Immunology, Weizmann Institute of Science, Rehovot 76100, Israel. ²Division of Internal Medicine, Tel Aviv Sourasky Medical Center, Tel Aviv 64239, Israel. ³Research Center for Digestive Tract and Liver Diseases, Tel Aviv Sourasky Medical Center, Tel Aviv 64239, Israel. *These authors contributed equally to this work.


7 JULY 2016 | VOL 535 | NATURE | 65


© 2016 Macmillan Publishers Limited. All rights reserved.


REVIEW


Figure 1 | Intestinal epithelial cells orchestrate the host-microbiota
interface. Intestinal epithelial cells use the recognition of microbial-cell
components and metabolites to adjust their antimicrobial programme
and metabolic homeostasis. The activation of PRRs, such as TLRs and the
NOD-like receptors NOD1 and NOD2, is directly coupled to the production
of antimicrobial peptides (including RegIIIγ, RegIIIβ, Ang4 and Itln1)
and of mucus. IL-18 plays an important part in this process through an
autocrine loop. The secretion of epithelial IL-18 requires transcriptional
activation through TLRs or the G-protein-coupled receptor GPR109a and
posttranscriptional cleavage through the NLRP6 inflammasome. NLRP6 can
also be induced by type I interferons and functions as a sensor of viral RNA
with pre-mRNA-splicing factor ATP-dependent RNA helicase DHX15. IL-18
and IL-22 derived from immune cells also help to regulate the antimicrobial
responses of epithelial cells. CCL20, which is derived from epithelial cells
downstream of NOD1 signalling, is involved in the genesis of lymphoid
tissue. NLRC4 promotes the expulsion of neoplastic or infected cells from the
intestinal epithelium. PRR signalling also orchestrates the circadian clock
within intestinal epithelial cells and adjusts the secretion of epithelial-derived
metabolic hormones, such as glucocorticoids. Epithelial cells also respond
to the levels of microbiota-modulated metabolites, such as SCFAs (acetate,
butyrate, propionate), polyamines (spermine), as well as amino acids and
products that are derived from them (taurine, histamine, indole). Taurine,
histamine and spermine modulate the activity of inflammasome component
NLRP6. Indole modulates the levels of incretin secretion and promotes the
barrier function of the epithelium through the PXR, which helps to fortify
tight junctions between cells. SCFAs serve as energy sources for epithelial cells
and also support barrier function through HIF. ASC, apoptosis-associated
speck-like protein containing a CARD; R, receptor.


NOD-containing protein 2 (NOD2), which is highly expressed
in the Paneth cells of the small intestine, is activated by microbial
peptidoglycan and generates a cellular response that includes the
secretion of cytokines, the induction of autophagy, intracellular vesicle
trafficking, epithelial regeneration and the production of antimicrobial
peptides, thereby influencing the composition of the microbiota.
Epithelial NOD1 is important for both the C-C motif chemokine 20
(CCL20)-mediated generation of isolated lymphoid follicles in the
intestine and homeostatic bacterial colonization.

PRRs in the epithelium are also important for the elimination of
pathogenic infection. Epithelial expression of the inflammasome-forming
NLR family CARD-domain-containing protein 4 (NLRC4), a sensor
of flagellin and bacterial secretion systems, promotes the expulsion of
infected intestinal epithelial cells, thereby contributing to the
elimination of enteric pathogens. NLRC4 also protects the host from
intestinal carcinogenesis, which provides evidence for a unified model in
which epithelial NLRC4 protects the epithelial layer by identifying and
dislodging cells that have undergone harmful insults.

Signalling by the NACHT-, LRR- and PYD-domain-containing
protein NLRP6 in intestinal epithelial cells is modulated by levels of
amino acids and polyamines in the lumen of the intestine. It regulates
the interface between the host and microorganisms through the
production of inflammasome-mediated interleukin (IL)-18 and the
downstream expression of antimicrobial peptides, and it also controls
the secretion of mucus by goblet cells. Deficiency in NLRP6 leads to


imbalances in the composition and function of the microbiota (dysbiosis),
altered microbial biogeography and enhanced susceptibility
to enteric infection. Furthermore, NLRP6 has been described as
a regulator of intestinal antiviral immunity, which suggests that it
might function in the control of both bacterial and viral parts of the 
microbiome. 

Other receptors also integrate microbial signals to adjust IL-18 levels,
including hydroxycarboxylic acid receptor 2 (also known as G-protein-coupled
receptor 109A), which is a receptor for butyrate and niacin, the DNA
sensor interferon-inducible protein AIM2 (ref. 32) and the inflammasome
component NLRP3. As a consequence, genetic deletion of these
receptors leads to intestinal inflammation, tumorigenesis and
susceptibility to enteric infection, which underlines the central role for
epithelial IL-18 in orchestrating the intestinal host-microbial interface.

Intriguingly, the impact of microorganisms on intestinal epithelial
cells extends far beyond the classical immunological functions of these
cells. Commensal colonization probably has a major role in the
metabolism of intestinal epithelial cells. Microbiota-derived short-chain
fatty acids (SCFAs) serve as an energy source for the epithelium and they
affect both oxygen consumption and hypoxia-inducible factor (HIF)-mediated
fortification of the epithelial barrier. The microbial metabolite
indole promotes barrier function through the pregnane X receptor
(PXR; also known as nuclear receptor subfamily 1 group I member
2 (NR1I2)) and increases the secretion of glucagon-like peptide-1
(GLP-1), an incretin with profound influences on host metabolism.
Microbiota-induced TLR signal transduction in intestinal epithelial
cells also drives intestinal hormone production through the coordination
of the circadian clock, a transcription-factor network that rhythmically
controls the diurnal succession of cellular metabolic activity. The
microbiota itself undergoes rhythmic oscillations in composition and
function, which suggests that the varying levels of microbial influence
on the innate immune system might underlie marked fluctuations
over the course of a day.

Taken together, intestinal epithelial cells integrate microbial signals 
into both the orchestration of the host-microbial interface, which con- 
sists of mucus and antimicrobial peptides, and the dynamic adjustment 
of cellular metabolism (Fig. 1). 


Myeloid cells 

Germ-free mice have a profoundly altered innate immune system.
The microbiota influences the development and function of myeloid
cells in multiple organs and at different time points during cellular
development (Fig. 2). In the absence of the microbiota, myeloid-cell
development in the bone marrow is reduced, which results in the
delayed clearance of systemic bacterial infection. The level of
myelopoiesis correlates with the complexity of the intestinal microbiota
and is adjusted in accordance with the level of TLR ligands that are
present in blood serum. Microbiota-derived SCFAs might similarly
drive myelopoiesis in the bone marrow. The influence of the
microbiota on myelopoiesis begins before birth. The offspring of mice that
are treated with antibiotics during pregnancy have lower numbers of
blood neutrophils and their bone-marrow precursors, and gestational
colonization with microorganisms increases the number of intestinal
mononuclear cells in newborn mice.

The microbiome also influences the maturation of myeloid cells after 
haematopoiesis. The continuous presence of microbiota-derived TLR 
ligands drives the ageing of neutrophils. The number of circulating
basophils is likewise influenced by microbiome-derived TLR ligands.

In addition to affecting circulating myeloid cells, the microbiota
strongly influences the biology of tissue-resident macrophages.
Microglia, the macrophages of the central nervous system, display an
altered morphology in germ-free mice, a phenotype that is, in part,
due to a paucity of SCFAs. In the skin, the microbiota influences the
composition and inflammatory potential of resident myeloid cells.
In the lungs, treatment with antibiotics causes a shift in macrophage
polarization that is mediated by prostaglandin E2, which enhances
susceptibility to allergic airway inflammation. In the intestine, microbial
SCFAs serve as a signal to alter the gene-expression profile of local
macrophages. The microbiota also regulates the trafficking of myeloid
cells in the gut. Intestinal microbial colonization drives the continuous
replenishment of macrophages in the intestinal mucosa by monocytes
that express C-C chemokine receptor type 2 (CCR2).

The tissue-specific effects of the microbiome on resident myeloid
cells go beyond bona fide immunological functions. Signals that are
released by the microbiota might influence the interactions between
neurons of the enteric nervous system and intestinal muscularis
macrophages to facilitate gastrointestinal motility. Commensal
microorganisms regulate both the expression of bone morphogenetic
protein 2 (BMP2) by muscularis macrophages and the production
of colony-stimulating factor 1 (CSF1; also known as macrophage
colony-stimulating factor 1) by enteric neurons, which in turn
influences smooth-muscle contractions in the intestinal muscle layer. The
microbiome also has an influence on tissue recovery after injury. A
2015 study found that the intestinal microbiota sustains inflammation
and lymphadenopathy after infection with Yersinia pseudotuberculosis,
thus compromising the return to homeostatic tissue-specific
immunity.

Such findings suggest that colonization by commensal microorganisms
profoundly shapes the myeloid landscape of the host, both in
mucosal tissues and systemically. Local concentrations of
microbiota-derived metabolites, as well as systemic levels of microbial products,
seem to drive myeloid-cell differentiation and function through PRR
signalling. Notably, these microbiota-driven alterations in the
myeloid-cell pool greatly influence the susceptibility of the host to a variety
of disorders, which range from infection and sepsis to allergy,
asthma and graft-versus-host disease. They also regulate the
effectiveness of vaccination and therapies for cancer.


Figure 2 | The integration of microbial signals by myeloid cells. The
microbiome influences the function of myeloid cells at all stages of their
development. The influence of the microbiome on the migration and gene
expression of tissue-resident myeloid cells is achieved mainly through the
modulation of local metabolites and mediators of tissue identity. Circulating
granulocytes are influenced by microbial PRR ligands. Myelopoiesis in the
bone marrow is reduced in the absence of commensal bacteria and their
microbial products in the blood. Panels: Bone-marrow myelopoiesis;
Tissue-resident myeloid cells; Circulating myeloid cells.


Innate lymphoid cells 

The influence of the microbiota is not limited to the development of 
the myeloid arm of the innate immune system. However, the regulation 
of innate lymphoid cells by the microbiota seems to follow rules and 
mechanisms that are different from the principles applied to myeloid-cell
regulation (Fig. 3). Innate lymphoid cells (ILCs), a recently discovered
lymphocyte branch of the innate immune system, develop
normally in the absence of the microbiota, but the proper functioning
of ILCs is dependent on commensal microbial colonization. Rather
than exerting their effect during lymphopoiesis, signals that stem from 
commensal microorganisms seem to influence the maturation and 
acquisition of the tissue-specific functions of ILCs. 

The ILC family consists of cytotoxic cells (natural killer cells) and
non-cytotoxic subsets (ILC1, ILC2 and ILC3). Most studies that examine
the influence of the microbiota on ILCs have focused on ILC3. The
importance of ILC3 cells in host-microbiota interactions became clear
when their depletion, and the resulting abrogation of IL-22 production,
was shown to produce a loss of bacterial containment in the
intestine. The microbiota also influences ILC3 interactions with
other components of the immune system. The presentation of microbial
antigens by ILC3s limits commensal-specific T-cell responses to
maintain tolerance to commensal bacteria. Microbial sensing and the
production of IL-1 by intestinal macrophages drive granulocyte-macrophage
colony-stimulating factor (GM-CSF) secretion by ILC3s, which
is required for macrophage function and the induction of oral
tolerance. Flagellin sensing by myeloid cells that carry the CD103 antigen is
required for the IL-23-mediated production of IL-22 by ILCs. Furthermore,
the production of lymphotoxin-α (also known as tumour necrosis
factor-β) by ILC3s is crucial for the production of IgA and for
microbiota homeostasis in the intestine. An equally important
microbiota-instructed function of ILCs is their communication with epithelial cells.
Microbiota-induced IL-22 production by ILC3s induces expression of
the enzyme fucosyltransferase 2 (galactoside 2-α-L-fucosyltransferase 2)
and fucosylation of surface proteins by intestinal epithelial cells, which
is required for host defence against enteric pathogens.

Although these examples highlight the importance of microbial signals
for the maturation and function of ILCs, the precise mechanisms
through which they exert their influence remain unclear and are, in
some cases, controversial. For instance, some studies have reported
elevated levels of IL-22 production by ILCs in the absence of the
microbiota, whereas others have documented the abrogation of IL-22
secretion. Different conclusions have also been reached in relation to
whether the number of tissue-resident ILCs is altered in mice that are
germ-free or have been treated with antibiotics. Further studies are
needed to reconcile these observations and their underlying mechanisms.

The microbiota might also influence the activity of the other ILC
subsets. ILC2s are activated by epithelial tuft-cell-derived IL-25
(ref. 71), which is produced in a microbiota-dependent manner.
Deletion of the ILC1-lineage transcription factor T-bet (also known
as T-box transcription factor TBX21) in the innate immune system
results in ILC-dependent and Helicobacter typhlonius-driven
inflammation of the intestines.

Collectively, the myeloid and lymphoid branches of the innate
immune system are shaped by the microbiota, but the underlying
mechanisms are based on distinct principles. A scenario could be
envisioned in which the complexity of commensal microbial colonization is
reflected in the amount of circulating PRR ligands and the concentrations
of microbiota-derived metabolites in tissues, both of which tune
the level of myelopoiesis, as well as the system's inflammatory capacity,
over the short term. By contrast, ILC development might be hardwired
to anticipate microbial colonization. Tissue-resident ILCs would then
integrate signals from the microbiota, through regulatory mechanisms
that are not fully understood, to fine-tune innate and adaptive immune
responses at the tissue level.


Figure 3 | The integration of microbial signals by ILCs. ILCs communicate
with the local microbiota through cytokines, PRR ligands and antimicrobial
peptides. In many cases, epithelial cells or myeloid cells serve as relay
stations for crosstalk between ILCs and the microbiota. Group 1 ILC (ILC1)
cells can be activated by myeloid-cell-derived IL-12. Group 2 ILC (ILC2)
cells are activated by epithelial-derived cytokines and orchestrate type 2
immunity through their interactions with mast cells, eosinophils, basophils
and macrophages. Group 3 ILC (ILC3) cells interact with cells of both the
innate and adaptive immune systems. They also secrete IL-22, which initiates
an antimicrobial programme as well as barrier fortification in epithelial
cells. AHR, aryl hydrocarbon receptor; AMPs, antimicrobial proteins; LT,
lymphotoxin; TSLP, thymic stromal lymphopoietin.


Effects of the innate immune system on the microbiome 

On sensing information about the metabolic state of the microbiota,
the innate immune system relays signals to the host to adapt tissue-level
physiology and might also adjust the composition and function of the
microbiota. Genetic evidence from humans and mice indicates that the
innate immune system plays an important part in regulating variations
in microbiota composition over time and between individuals.
Dysbiosis has been reported in several mouse models of innate immune
deficiency, such as in mice that lack the genes NOD2 (refs 17, 19, 74),
NLRP6 (ref. 27) or TLR5 (ref. 75). The innate immune system might
therefore function to promote the growth of beneficial members of
the microbiota and to contribute to the maintenance of a stable
community of microorganisms. This is best demonstrated by the induction
of epithelial fucosylation by ILC3s and IL-22. During starvation
that is associated with intestinal infection, the shedding of fucosylated
proteins into the intestinal lumen serves as a source of energy for
commensal bacteria. Innate-immune-system resources therefore can be
mobilized to support the microbiota during perturbations of the
intestinal ecosystem. Similarly, TLR1 signalling is required to maintain the
composition of the microbiota after Yersinia enterocolitica infection.
By contrast, PRRs do not seem to play a part in the development of the
microbiota after treatment with antibiotics has ended. However, it
remains possible that activities of the microbiota that are independent
of PRRs are involved in controlling the succession of microbial
colonization after catastrophic events in the ecosystem.

The mechanisms through which the microbiota controls the development
of the innate immune system are beginning to be understood,
although the principles and purpose of innate-immune control over
temporal dynamics in microbiota function remain unknown. Future
mechanistic studies need to better define the characteristics of a
‘healthy’ microbiome that the host immune system attempts to
preserve. Insights into such mechanisms came from the finding that
dysbiosis in NLRP6-deficient mice was associated with similar metagenomic
functions when the mice were studied in different animal facilities.
Dysbiosis developed de novo after the colonization of germ-free
NLRP6-deficient mice, which indicates that certain PRRs might create specific
antimicrobial landscapes that are associated with the preservation of
distinct functions of the microbiome.


Mechanisms of system crosstalk 

A wide range of physiological contexts are influenced by communi- 
cation between the microbiota and the innate immune system, and it 
is interesting to consider the molecular and cellular mechanisms that 
mediate this communication at the functional level. Commensal micro- 
bial colonization is known to influence the activity of the innate immune 
system according to a number of common principles. 


Transcriptional reprogramming 

One of the most striking observations made in germ-free mice was the 
reprogramming of intestinal gene expression in animals that were colonized
with a single commensal bacterium or a single enteric virus.
This includes the expression of genes that are involved in host nutrient 
absorption and processing, barrier functions, gut motility, intestinal 
immune responses, angiogenesis and the metabolism of xenobiotics. 
Studies of germ-free mice and of natural microbial colonization during 
postnatal development have substantiated such findings by showing that 
transcriptional reprogramming of the intestine by the microbiota spans 
different regions of the gastrointestinal tract and is partially dependent 
on microbial sensing receptors of the innate immune system. The
impact of the microbiome on transcription reaches beyond the intestine. 
For instance, the livers of germ-free mice show massive alterations in 
the expression of a range of genes with metabolic and non-metabolic 
functions.

The transcriptional responses of the host to bacterial colonization 
are in part evolutionarily conserved, as shown by reciprocal microbiota 
transplantations between mice and zebrafish. Yet there is a considerable
degree of species specificity in host responses to microbial colonization,
especially with respect to the maturation of the immune system.
Although such examples underline the importance of transcriptional 
responses to commensal colonization for the innate immune system, 
several lines of evidence suggest that regulation also occurs through 
mechanisms other than gene expression. Constituents of the microbiota 
have been implicated in the regulation of ubiquitin signalling, protein
neddylation, the nuclear translocation of RelA (also known as
transcription factor p65) (ref. 89) and vesicle trafficking, which indicates
that the full regulatory reach of commensal microorganisms is yet to 
be defined. 


Epigenetic programming 
Because a large fraction of the transcriptome is shaped by the microbi- 
ome in an organ-specific manner, gene regulatory mechanisms must 
integrate microbial signals into the orchestration of gene expression. 
Although it is appreciated that bacterial pathogens can modulate host 
epigenomics, the epigenetic interpretation of commensal microbial colo- 
nization by the innate immune system is only starting to be investigated. 
On an organismal scale, the open chromatin landscape was ruled out as
the mediator of transcriptional reprogramming in the intestine
because chromatin accessibility in germ-free mice
is similar to that in colonized mice. Instead, microbial regulation of
gene transcription in the host might be achieved by differential expres- 
sion of specific transcription factors and their binding to chromatin. 
The exploration of this possibility on an organismal scale could reveal 
potential regulatory pathways through which information on the state of 
the microbiota is integrated into the chromatin landscape of host tissues. 
Specific examples of this phenomenon exist in the context of the innate 
immune system. Analysis of epigenetic modifications in the intestinal 
epithelial cells of germ-free mice revealed a low level of methylation on 
the gene that encodes the lipopolysaccharide sensor TLR4, which indicates
that commensal bacteria might induce tolerance through the epigenetic
repression of PRRs. Microbial colonization of germ-free neonatal
mice was found to decrease the methylation level of the chemokine-encoding
gene Cxcl16, which reduced its expression and diminished
the recruitment of invariant natural killer T cells, ameliorating colitis
and allergic asthma. A comparison of mononuclear phagocytes from
colonized and germ-free mice revealed that the microbiota promotes the 
trimethylation of histone H3 at lysine 4 at the loci of inflammatory genes, 
including those that encode the type I interferons. The acetylation
of histones is similarly involved in the crosstalk between the microbiota 
and the innate arm of the immune system. When histone deacetylase 3 
is specifically deleted from intestinal epithelial cells, gene expression is 
massively altered and the integrity of the epithelial barrier is lost. These
aberrations are known to be microbiota-dependent because germ-free
mice that lack intestinal histone deacetylase 3 do not present the same
phenotype as their colonized counterparts.

Although the microbial signals that are responsible for specific 
epigenetic alterations are mostly unknown, it seems probable that 
microbial metabolites, rather than just the presence or absence of micro- 
organisms, mechanistically influence the orchestration of histone modi- 
fications. For instance, the microbiota-derived SCFA butyrate was shown 
to modulate the immune response of colonic macrophages through the 
inhibition of histone deacetylases, with a potential contribution to the
maintenance of immunological tolerance to commensal microorgan- 
isms. Transcriptional reprogramming through epigenetic modifications 
is therefore a prominent mechanism by which the microbiota exerts 
its influence on host innate immunity. The elucidation of the precise 
mechanisms through which microbial molecules influence host-cell epi- 
genomes and adjust the transcriptome to respond to the state of micro- 
bial colonization is an exciting area for future research. 


Hierarchical feedback loops 

The local containment and functional maintenance of a microbial
ecosystem within the host is a formidable challenge for the mammalian
innate immune system. Co-evolution between the microbiota and
the host has led to the development of sophisticated feedback loops
to accomplish this task. These loops can be regulated by various layers
of cells within the intestinal wall. Although they are often restricted
to the epithelium, which is directly exposed to the microbiota, they
sometimes extend into the underlying mucosal lamina propria or even
the lymphatic and portal circulation (Fig. 4).


Figure 4 | The hierarchy of anatomy in microbiome-innate-immune-system
interactions. Feedback loops between the host and the microbiome
can be restricted to the epithelial layer of the intestinal wall, in which they
consist of a brief circuit that links microbial sensing with transcriptional
reprogramming and antimicrobial responses. A prototypical cytokine for
such communication is the paracrine IL-18. Feedback loops that extend to
the underlying lamina propria involve communication between epithelial,
myeloid and lymphoid cells using cytokines and chemokines. Examples
of cytokines that mediate such interactions are IL-22 and IL-23. Microbial
products can also reach the draining lymph node and liver, where dendritic
cells regulate anticommensal T-cell immunity to promote microbial
containment. AMPs, antimicrobial peptides.

In evolutionary terms, feedback loops that are restricted to the epithe- 
lium could represent the most ancient form of host-microbiota inter- 
action. Such loops consist of only three steps: first, the recognition of 
microbes by PRRs; second, the transcriptional response of the host; and 
third, the secretion of effector molecules. The advantage of using such 
confined regulatory circuits is that the inflammatory response can be 
limited to the epithelial layer, without involving entire tissues or multi- 
ple organs. Examples include the epithelial-autonomous regulation of 
antimicrobial-peptide and mucus secretion by NLRP6 and NOD2, as 
well as the control of intestinal epithelial cell death by NLRC4, which
all occur without the apparent contribution of other regulatory layers
of cells.


The crosstalk between the innate immune system and the microbi- 
ome can also extend to the lamina propria. Microbial sensing by myeloid 
cells of the lamina propria provides regulatory signals that are crucial 
for the maintenance of commensal mutualism and the initiation of 
inflammatory responses in the host. Myeloid cells modulate important
pathways such as IL-22 production by ILCs, which induces the
production of epithelial regenerating islet-derived protein 3β (RegIIIβ)
and RegIIIγ, antimicrobial peptides that are important for maintaining
a spatial separation between the majority of commensal bacteria and
the intestinal epithelial layer, and this modulation is also pivotal for the
local containment of commensals.

Regulatory circuits that reach the lymphatic and portal circulation 
represent a further level of interaction between the microbiome and the 
immune system. Migration to the mesenteric lymph nodes of antigen- 
presenting cells that carry material from commensal gut microbes is 
essential for the induction of commensal-specific adaptive immune 
responses. Likewise, dendritic cells carry microbial antigens from
colonized skin to the draining lymph nodes, where the production of
cytokines determines the signature of the anticommensal immune
response. A similar ‘firewall’ might apply in the liver, which microbial
products access through the portal vein.

Multiple levels of anatomy therefore contribute to the innate immune- 
mediated containment of the microbiota and to the tailoring of the 
immune response to the tissue-specific characteristics of host-micro- 
biota interactions. 


Impact on diseases 

The interactions between the host and its microbiota are crucial for 
the preservation of tissue homeostasis. It is unsurprising therefore 
that perturbed interactions have emerged as a pivotal driver of various 
chronic disease states (see page 94). Three concurrent themes of inter- 
actions between the microbiome and the innate immune system are 
emerging as important contributors to microbiome-mediated disease 
phenotypes. First, microbial products might serve as perpetual stimuli 
of chronic immune responses, which contribute to the occurrence of 
non-resolving inflammation. For instance, microbial signals can sustain 
inflammation and tissue damage after infection-induced injury to the 
mucosa. Second, abnormal microbial development during maturation
of the innate immune system results in a failure to induce immuno- 
logical tolerance, which then leads to exacerbated autoimmune and 
autoinflammatory disorders later in life. An example of this is the
condition allergen-induced airway hyperreactivity. Third, the microbiome
greatly influences the factors that control tissue-specific immunity
through mechanisms that can be active even at sites that are distant from
the microbiome. Therefore, dysbiosis can trigger pathophysiologies
at remote organs and manifest as distinct symptoms in the context of 
‘sterile’ tissues. For instance, intestinal dysbiosis drives the remodelling 
of the haematopoietic stem-cell niche in the bone marrow, and it also
alters the differentiation of progenitor cells in the context of obesity.


A number of medical conditions that occur in people, or the equiva- 
lent conditions in animals, demonstrate how aberrations in the crosstalk 
between the innate immune system and the microbiome can contribute 
to pathogenesis on a molecular and cellular level (Fig. 5). 


Infection 

The microbiota contributes to the health of the host by colonizing the 
mucosal entry sites of pathogens, where it occupies biological niches 
and prevents invasion of the ecosystem by foreign elements — a concept 
known as colonization resistance (see page 85). In addition to its direct 
mediation of niche competition, the microbiota mediates resistance to 
infection indirectly by stimulating the innate immune response. 

A prominent example of this is intestinal immunity to viral infections,
in which the host response is impaired by antibiotic-mediated
depletion of commensal bacteria. Effective antiviral
innate immunity in the intestine is achieved through the induction of
interferon (IFN)-λ and IL-18 or IL-22 (refs 110, 111) pathways, which
then cooperate to induce the activation of signal transducer and activator
of transcription 1 (STAT1) and antiviral genes. Although IL-18
and IL-22 are induced by commensal bacteria, the expression of IFN-λ
is suppressed by the microbiota, which enables efficient viral persistence.
Similarly, certain viruses can hijack interactions between bacterial
molecules and the innate immune system, such as LPS-TLR4
signalling, to ensure their efficient transmission.

The microbiome and innate immune system also cooperate in the 
eradication of bacterial infection. Sometimes, neither innate immu- 
nity nor colonization resistance is sufficient to ensure the expulsion of 
pathogens. Instead, a combination of the two is required, as in the case 
of cooperation in the host defence against Citrobacter rodentium,
a bacterium that can cause disease in mice. However, such combinato- 
rial responses can be subverted by the pathogen. During infection with 
Salmonella Typhimurium, microbiota-induced IL-22 elicits a response 
that targets commensal bacteria and liberates a colonization niche for 
the pathogenic bacterium. Porphyromonas gingivalis, an oral bacterium
that is associated with periodontitis, evades the host by modulating
the TLR2 pathway to support a niche for dysbiosis and subsequent
inflammation.


Autoimmunity and autoinflammation 
Inflammatory bowel disease (IBD) is a group of chronic inflammatory 
disorders of multifactorial aetiology that affect the gastrointestinal tract 
and extraintestinal organs. These disorders provide models for studying 
perturbed crosstalk between the microbiota and the innate immune 
system because they integrate all aspects of mucosal immunology at 
the interface between microbial colonization and innate-immune-sys- 
tem activation. They also clearly demonstrate how limitations in our 
mechanistic understanding of this crosstalk hamper the development 
of treatments for common human disorders. Dysbiosis has a central 
role in the pathogenesis of IBD, and the introduction of bacteria that are 
associated with IBD into a murine model of colitis resulted in chronic 
disease, which suggests that immune dysfunction as an adjunct to
specific microbial alteration is necessary for the development of IBD. 
Despite large-scale efforts, however, no particular species or group of 
commensal or pathogenic microorganisms has been identified as the 
cause of IBD in humans. Instead, multiple mechanisms at the interface 
between the innate immune system and the microbiome, such as micro- 
bial sensing, the release of reactive oxygen species and antigen process- 
ing, were hypothesized to contribute to the molecular pathophysiology 
of IBD. Genome-wide association studies in humans have found
allelic variance in several of the genes that regulate the innate immune 
system. These include: NOD2 (refs 122, 123), which is linked to activa- 
tion of the immune system by peptidoglycans; ATG16L1 (refs 124, 125), 
which has a role in autophagy; and CLEC7A, which is involved in the
recognition of fungi by dendritic cells. 

Dysbiosis might also promote other extraintestinal inflammatory
and autoimmune disorders, although the underlying mechanism is yet
to be completely unravelled. Type 1 diabetes is associated with microbiota
compositions that are characterized by low diversity and the
expansion of distinct groups of bacteria. Non-obese diabetic mice,
an animal model for type 1 diabetes, could be phenotypically rescued by
the deletion of the gene Myd88. However, germ-free, MyD88-deficient
non-obese diabetic mice do develop type 1 diabetes, which could be
attenuated by faecal transplantation, demonstrating that
microbiota-innate-immune-system interactions can modify the disease.
Rheumatoid arthritis was found to associate with an overabundance of
Prevotella copri and a propensity to develop colitis. Such examples suggest that
even classic autoimmune diseases might contain an autoinflammatory 
component that is driven by perturbed communication between the 
host and the microbiota. 

Interactions between the microbiota and the innate immune system 
also participate in pulmonary and atopic phenomena. Commensal 
bacteria have been shown to protect against food allergy and allergic 
airway inflammation; germ-free mice and mice treated with antibiotics 
develop exacerbated disease. Mice that are deficient in TLR2 or
TLR4 develop pulmonary damage on the chronic intake of a high-fat 
diet. This damage is abrogated in germ-free mice or mice that consume 
antibiotics, and it can be transmitted to wild-type mice by faecal
transplantation. Together, these findings reveal the trialogue that exists
between the microbiota, the host and environmental factors and that 
contributes to common idiopathic diseases. 


Metabolic syndrome 

Obesity has become a global-health problem; in 2014, approximately
40% of adults worldwide were overweight and 13% were obese,
according to the World Health Organization. The association of obesity 
with other metabolic derangements, such as type 2 diabetes, hyperten- 
sion, dyslipidaemia and non-alcoholic fatty liver disease, is known as 
metabolic syndrome. This complex of conditions is highly associated 
with cardiovascular morbidity and mortality, and it has become the 
leading cause of death worldwide (see page 56). 

Obesity and type 2 diabetes are associated with chronic low-grade 
inflammation and an increased expression of PRRs in adipose tissue, 
muscle tissue and in circulating monocytes. Both conditions also
trigger dysbiosis, which is consistent with the idea that diet and PRR
activation shape the microbial composition of the gut. In mice, certain
deficiencies of innate-immune receptors induce metabolic aberrations
and dysbiosis, which can be transferred to wild-type mice by
faecal transplantation and abrogated by treatment with antibiotics.
The microbiota, innate immunity and metabolic syndrome are directly
linked through the secretion of IL-22 by ILCs, a mechanism that was
found to preserve the integrity of the intestinal mucosal barrier, thereby
alleviating metabolic disorders.

Other constituent conditions of metabolic syndrome, such as hyper- 
tension and dyslipidaemia, have also been linked to intestinal bacteria. 
The bacterial composition of stool samples obtained from people with
these conditions features dysbiosis and reduced taxonomic diversity.
The pathogenesis of non-alcoholic fatty liver disease is linked to interac- 
tions between the microbiota and the innate immune system of the host. 
Deficiencies in inflammasome components exacerbate non-alcoholic 
fatty liver disease owing to the induction of colonic inflammation and 
a subsequent increase in the release of TLR agonists from the gut and 
their arrival at the liver through the portal circulation.

Atherosclerosis, a progressive inflammatory process that is another 
component disorder of metabolic syndrome, involves the accumula- 
tion of lipids and the formation of plaques around arterial walls. This 
pathology was linked to the intestinal microbiota as a result of several 
observations. 

Figure 5 | Microbiome-innate-immune-system interactions are involved
in multifactorial diseases. Many inflammatory disorders are influenced by
alterations in the crosstalk between innate immunity and the microbiome.
These include metabolic (red boxes), neoplastic (orange box) and
autoimmune or autoinflammatory (blue boxes) disorders. Modulation of the
severity of a disorder through dietary interventions and their influence on
microbiome-immune interactions is an exciting area of research.
Diagram labels: microbiota, innate immune system, diet, rheumatoid
arthritis, type 1 diabetes, ankylosing spondylitis, inflammatory bowel
disease, non-alcoholic fatty liver disease, pulmonary disease and atopy,
carcinogenesis, obesity and atherosclerosis.

First, the administration of antibiotics was shown to confer beneficial
effects on cardiovascular risk factors in a murine model of atherosclerosis.
Second, some of the bacterial species in atherosclerotic plaques are
common to both the oral and intestinal microbiota, and the presence or
absence of these groups correlates with levels of cholesterol in the blood
plasma. Third, metabolomic analysis revealed that trimethylamine
N-oxide, a metabolite whose formation from phospholipids found in red
meat depends on the intestinal microbiota, promotes atherosclerosis and
increases the risk of cardiovascular diseases. Intriguingly, the targeted
inhibition of trimethylamine N-oxide attenuates features of
atherosclerosis, which paves the way for a microbiota-mediated therapeutic
approach to the treatment of cardiovascular diseases. Atherosclerosis
is also dependent on the host’s innate immunity, because a deficiency in 
Myd88, specific TLRs or components of the inflammasome suppresses 
the condition in murine models. 


Cancer 
The idea that chronic inflammation drives carcinogenesis has been 
widely established in various tissues. For example, hepatocellular 
carcinomas arise in people with chronic hepatitis, colorectal cancer 
can occur in people with longstanding untreated IBD and Marjolin’s 
ulcers develop on chronically inflamed skin. The presence of bacteria at 
tumour sites was first described more than a century ago, so it is surpris- 
ing that the role of the microbiota in tumourigenesis has only recently 
been recognized. Colorectal carcinogenesis is triggered by a combina- 
tion of microbiota- and host-dependent mechanisms. Certain bacteria 
promote carcinogenesis directly, through the secretion of substances 
that elicit DNA damage. Prominent examples include the excessive
release of nitric oxide from immune cells that is triggered by Helicobac- 
ter hepaticus, the production of reactive oxygen species by Enterococcus 
faecalis and the secretion of an enterotoxin by Bacteroides fragilis, which 
activates the oncogene c-MYC. Other bacteria drive carcinogenesis indi- 
rectly by sustaining a proinflammatory microenvironment, such as the 
production by Fusobacterium nucleatum of the virulence factor FadA, 
which increases the paracellular permeability of colonic epithelial cells. 
Inflammation might also promote community-level alterations in the 
microbiome and facilitate bacterial translocation into neoplastic tissue, 
which further promotes the expression of inflammatory cytokines and 
leads to the increased growth of tumours. Dysbiosis that arises in
the absence of NLRP6 promotes the development of cancer through
IL-6-induced epithelial proliferation.

The influence of the microbiota on innate immunity has been shown 
to affect the host response to cancer therapy. For example, germ-free 
mice and mice that are treated with antibiotics both show a diminished 
response to immunotherapy by CpG oligonucleotides and chemo- 
therapy owing to the impaired function of myeloid-derived cells in 
the tumour microenvironment. Furthermore, commensal Bifidobacterium
enhances the antitumour immunity elicited by antibodies directed
against programmed cell death 1 ligand 1 (PD-L1) by augmenting
dendritic-cell function. These studies might open up
a fascinating avenue of research to prevent cancer and develop cancer 
therapeutics through manipulation of the microbiota. 


Future directions 

The importance of the innate-immune sensing of commensal micro- 
organisms was recognized merely a decade ago. Since then, multiple 
levels of interaction between the microbiota and the cells of the innate 
immune system have been uncovered, which range from molecular 
events at the level of individual cells to the physiology of entire organs. 
The importance of the microbiome in mammalian health and disease 
is clearly recognized, and in many cases the innate immune system 
provides the causal link between disease-associated microbial altera- 
tions and the pathophysiological mechanisms of the host. Nonetheless, 
very few of the insights gained from the study of microbiome-innate- 
immune-system interactions have been used to develop clinical thera- 
pies for inflammatory diseases. In the next decade, research in the field 
must therefore reach a number of milestones that will help to harness 
our knowledge to provide clinical applications. 

First, the majority of insights so far have been gained from stud- 
ies of mouse models. The relevance of these principles for microbi- 
ome-innate-immune-system interactions in humans remains to be 
determined. 

Second, knowledge of how the microbiome influences the innate 
immune response is based mostly on well-known examples and might 
not fully represent the scope of possible mechanisms. Systematic stud- 
ies that screened members of the microbiome for their effects on the 
immune system suggest that the range of commensal bacteria that mod- 
ulate the maturation of the immune system might be far larger than was 
previously anticipated. Whether the entirety of
microbiota-innate-immune-system interactions can be classified according
to a limited number of paradigms (that is, whether certain groups of
bacteria use common mechanisms to modulate the innate immune system)
is still to be uncovered.

Third, in comparison to its effect on the adaptive immune system 
(see page 75), very little is known about the bacterial species, effector 
molecules and molecular mechanisms through which the microbiota 
exerts its immune-modulating effect on the cells of the innate arm of the 
immune system. Because it lacks antigen specificity, the innate immune 
system might act by broadly evaluating the activity of the microbiome 
through tissue-level microbial sensing rather than by responding to 
particular species of bacteria. A comprehensive characterization of the 
bacterial components and metabolites that are sensed by the innate 
immune system, through either PRRs or other sensors, as well as their 
effects on the transcriptional and post-transcriptional landscape of the 
host, will greatly facilitate our ability to understand the molecular aetiol- 
ogy of microbiome-driven disorders. 

Fourth, our deepening knowledge about the interactions between the 
innate immune system and the microbiome will ultimately result in the 
development of therapeutic approaches that target these processes. Such 
interventional strategies, especially when applied to humans, should 
take into account the enormous variation in both microbiome
configurations and innate immune responses that exists between individuals.
However, the fact that the microbiome is amenable to rapid change 
through dietary interventions could be exploited to construct tailored 
diets that alter microbiome function and downstream innate immune
responses to influence common, multifactorial disorders. Dietary
modification might alter the microbiome in a way that would enable it to be
primed for subsequent immunomodulatory interventions, thereby inte- 
grating both treatment modalities (Fig. 5). Alternatively, the identifica- 
tion of ‘postbiotic’ bioactive microbiome-modulated compounds might 
allow common downstream pathways in the host to be targeted, thereby 
influencing the development and outcome of disorders. The future of 
immunotherapy might therefore combine direct, drug-based immune 
modulation with microbiome and metabolome modification to col- 
lectively target both microbial and host components of the molecular 
aetiology of disease.


Received 8 December 2015; accepted 15 April 2016. 


1. Thaiss, C. A., Levy, M., Itav, S. & Elinav, E. Integration of innate immune signaling.
Trends Immunol. 37, 84-101 (2016).

2. Eckburg, P. B. et al. Diversity of the human intestinal microbial flora. Science
308, 1635-1638 (2005).

3. Shibolet, O. & Podolsky, D. K. TLRs in the gut. IV. Negative regulation of Toll-like
receptors and intestinal homeostasis: addition by subtraction. Am. J. Physiol.
Gastrointest. Liver Physiol. 292, G1469-G1473 (2007).

4. Rakoff-Nahoum, S., Paglino, J., Eslami-Varzaneh, F., Edberg, S. & Medzhitov, R.
Recognition of commensal microflora by Toll-like receptors is required for
intestinal homeostasis. Cell 118, 229-241 (2004).

Refs 4 and 5 highlight the importance of innate-immune-system recognition
of the microbiota for host-microbiota homeostasis.

5. Slack, E. et al. Innate and adaptive immunity cooperate flexibly to maintain
host-microbiota mutualism. Science 325, 617-620 (2009).

6. Pfeiffer, J. K. & Virgin, H. W. Transkingdom control of viral infection and
immunity in the mammalian intestine. Science 351, aad5872 (2016).

7. Underhill, D. M. & Pearlman, E. Immune interactions with pathogenic and
commensal fungi: a two-way street. Immunity 43, 845-858 (2015).

8. Thaiss, C. A., Levy, M., Suez, J. & Elinav, E. The interplay between the innate
immune system and the microbiota. Curr. Opin. Immunol. 26, 41-48 (2014).

9. Pott, J. & Hornef, M. Innate immune signalling at the intestinal epithelium in
homeostasis and disease. EMBO Rep. 13, 684-698 (2012).

10. Nenci, A. et al. Epithelial NEMO links innate immunity to chronic intestinal
inflammation. Nature 446, 557-561 (2007).

11. Vaishnava, S. et al. The antibacterial lectin RegIIIγ promotes the spatial
segregation of microbiota and host in the intestine. Science 334, 255-258
(2011).

12. Vlantis, K. et al. TLR-independent anti-inflammatory function of intestinal
epithelial TRAF6 signalling prevents DSS-induced colitis in mice. Gut
http://dx.doi.org/10.1136/gutjnl-2014-308323 (2015).

13. Dannappel, M. et al. RIPK1 maintains epithelial homeostasis by inhibiting
apoptosis and necroptosis. Nature 513, 90-94 (2014).

14. Günther, C. et al. Caspase-8 regulates TNF-α-induced epithelial necroptosis and
terminal ileitis. Nature 477, 335-339 (2011).

15. Takahashi, N. et al. RIPK1 ensures intestinal homeostasis by protecting the
epithelium against apoptosis. Nature 513, 95-99 (2014).

16. Welz, P.-S. et al. FADD prevents RIP3-mediated epithelial cell necrosis and
chronic intestinal inflammation. Nature 477, 330-334 (2011).

17. Couturier-Maillard, A. et al. NOD2-mediated dysbiosis predisposes mice to
transmissible colitis and colorectal cancer. J. Clin. Invest. 123, 700-711 (2013).

18. Nigro, G., Rossi, R., Commere, P. H., Jay, P. & Sansonetti, P. J. The cytosolic
bacterial peptidoglycan sensor Nod2 affords stem cell protection and links
microbes to gut epithelial regeneration. Cell Host Microbe 15, 792-798 (2014).

19. Ramanan, D., Tang, M. S., Bowcutt, R., Loke, P. & Cadwell, K. Bacterial sensor
Nod2 prevents inflammation of the small intestine by restricting the expansion
of the commensal Bacteroides vulgatus. Immunity 41, 311-324 (2014).

20. Bouskra, D. et al. Lymphoid tissue genesis induced by commensals through
NOD1 regulates intestinal homeostasis. Nature 456, 507-510 (2008).

21. Nordlander, S., Pott, J. & Maloy, K. J. NLRC4 expression in intestinal epithelial 
cells mediates protection against an enteric pathogen. Mucosal Immunol. 7, 
775-785 (2014). 

22. Sellin, M. E. et al. Epithelium-intrinsic NAIP/NLRC4 inflammasome drives 
infected enterocyte expulsion to restrict Salmonella replication in the intestinal 
mucosa. Cell Host Microbe 16, 237-248 (2014). 

23. Allam, R. et al. Epithelial NAIPs protect against colonic tumorigenesis. J. Exp.
Med. 212, 369-383 (2015).

24. Hu, B. et al. Inflammation-induced tumorigenesis in the colon is regulated by
caspase-1 and NLRC4. Proc. Natl Acad. Sci. USA 107, 21635-21640 (2010).

25. Levy, M. et al. Microbiota-modulated metabolites shape the intestinal
microenvironment by regulating NLRP6 inflammasome signaling. Cell 163,
1428-1443 (2015).

Refs 25-29 demonstrate the role of epithelial NLRP6 in orchestrating
antimicrobial peptide production, mucus secretion and viral recognition.

26. Wlodarska, M. et al. NLRP6 inflammasome orchestrates the colonic host-
microbial interface by regulating goblet cell mucus secretion. Cell 156,
1045-1059 (2014).

27. Elinav, E. et al. NLRP6 inflammasome regulates colonic microbial ecology and
risk for colitis. Cell 145, 745-757 (2011).

28. Normand, S. et al. Nod-like receptor pyrin domain-containing protein 6
(NLRP6) controls epithelial self-renewal and colorectal carcinogenesis upon
injury. Proc. Natl Acad. Sci. USA 108, 9601-9606 (2011).

29. Wang, P. et al. Nlrp6 regulates intestinal antiviral innate immunity. Science 350,
826-830 (2015).

30. Macia, L. et al. Metabolite-sensing receptors GPR43 and GPR109A
facilitate dietary fibre-induced gut homeostasis through regulation of the
inflammasome. Nature Commun. 6, 6734 (2015).

31. Singh, N. et al. Activation of Gpr109a, receptor for niacin and the commensal
metabolite butyrate, suppresses colonic inflammation and carcinogenesis.
Immunity 40, 128-139 (2014).

32. Hu, S. et al. The DNA sensor AIM2 maintains intestinal homeostasis via
regulation of epithelial antimicrobial host defense. Cell Rep. 13, 1922-1936
(2015).

33. Man, S. M. et al. Critical role for the DNA sensor AIM2 in stem cell proliferation
and cancer. Cell 162, 45-58 (2015).

34. Song-Zhao, G. X. et al. Nlrp3 activation in the intestinal epithelium protects
against a mucosal pathogen. Mucosal Immunol. 7, 763-774 (2014).

35. Kelly, C. J. et al. Crosstalk between microbiota-derived short-chain fatty acids
and intestinal epithelial HIF augments tissue barrier function. Cell Host Microbe
17, 662-671 (2015).

36. Venkatesh, M. et al. Symbiotic bacterial metabolites regulate gastrointestinal
barrier function via the xenobiotic sensor PXR and Toll-like receptor 4. Immunity
41, 296-310 (2014).

37. Chimerel, C. et al. Bacterial metabolite indole modulates incretin secretion from
intestinal enteroendocrine L cells. Cell Rep. 9, 1202-1208 (2014).

38. Mukherji, A., Kobiita, A., Ye, T. & Chambon, P. Homeostasis in intestinal
epithelium is orchestrated by the circadian clock and microbiota cues
transduced by TLRs. Cell 153, 812-827 (2013).

39. Thaiss, C. A. & Elinav, E. Exploring new horizons in microbiome research. Cell
Host Microbe 15, 662-667 (2014).

40. Zarrinpar, A., Chaix, A., Yooseph, S. & Panda, S. Diet and feeding pattern affect
the diurnal dynamics of the gut microbiome. Cell Metab. 20, 1006-1017
(2014).

41. Khosravi, A. et al. Gut microbiota promote hematopoiesis to control bacterial
infection. Cell Host Microbe 15, 374-381 (2014).

42. Balmer, M. L. et al. Microbiota-derived compounds drive steady-state
granulopoiesis via MyD88/TICAM signaling. J. Immunol. 193, 5273-5283
(2014).

43. Trompette, A. et al. Gut microbiota metabolism of dietary fiber influences
allergic airway disease and hematopoiesis. Nature Med. 20, 159-166 (2014).

44. Deshmukh, H. S. et al. The microbiota regulates neutrophil homeostasis and
host resistance to Escherichia coli K1 sepsis in neonatal mice. Nature Med. 20,
524-530 (2014).

45. Gomez de Agüero, M. et al. The maternal microbiota drives early postnatal
innate immune development. Science 351, 1296-1302 (2016).

46. Zhang, D. et al. Neutrophil ageing is regulated by the microbiome. Nature 525,
528-532 (2015).

47. Hill, D. A. et al. Commensal bacteria-derived signals regulate basophil
hematopoiesis and allergic inflammation. Nature Med. 18, 538-546 (2012).

48. Erny, D. et al. Host microbiota constantly control maturation and function of
microglia in the CNS. Nature Neurosci. 18, 965-977 (2015).

49. Tamoutounour, S. et al. Origins and functional specialization of macrophages
and of conventional and monocyte-derived dendritic cells in mouse skin.
Immunity 39, 925-938 (2013).

50. Kim, Y. G. et al. Gut dysbiosis promotes M2 macrophage polarization and
allergic airway inflammation via fungi-induced PGE2. Cell Host Microbe 15,
95-102 (2014).

51. Chang, P. V., Hao, L., Offermanns, S. & Medzhitov, R. The microbial metabolite
butyrate regulates intestinal macrophage function via histone deacetylase
inhibition. Proc. Natl Acad. Sci. USA 111, 2247-2252 (2014).

52. Bain, C. C. et al. Constant replenishment from circulating monocytes maintains
the macrophage pool in the intestine of adult mice. Nature Immunol. 15,
929-937 (2014).

53. Muller, P. A. et al. Crosstalk between muscularis macrophages and enteric
neurons regulates gastrointestinal motility. Cell 158, 300-313 (2014); erratum
158, 1210 (2014).

54. Fonseca, D. M. et al. Microbiota-dependent sequelae of acute infection
compromise tissue-specific immunity. Cell 163, 354-366 (2015).

55. Ganal, S. C. et al. Priming of natural killer cells by nonmucosal mononuclear
phagocytes requires instructive signals from commensal microbiota. Immunity
37, 171-186 (2012).

56. Schwab, L. et al. Neutrophil granulocytes recruited upon translocation of
intestinal bacteria enhance graft-versus-host disease via tissue damage. Nature
Med. 20, 648-654 (2014).

57. Oh, J. Z. et al. TLR5-mediated sensing of gut microbiota is necessary for
antibody responses to seasonal influenza vaccination. Immunity 41, 478-492
(2014).

58. Iida, N. et al. Commensal bacteria control cancer response to therapy by
modulating the tumor microenvironment. Science 342, 967-970 (2013).

59. Sawa, S. et al. Lineage relationship analysis of RORγt+ innate lymphoid cells.
Science 330, 665-669 (2010).

60. Sanos, S. L. et al. RORγt and commensal microflora are required for the
differentiation of mucosal interleukin 22-producing NKp46+ cells. Nature
Immunol. 10, 83-91 (2009).

61. Satoh-Takayama, N. et al. Microbial flora drives interleukin 22 production in
intestinal NKp46+ cells that provide innate mucosal immune defense. Immunity
29, 958-970 (2008).


62. Sawa, S. et al. RORγt+ innate lymphoid cells regulate intestinal homeostasis by
integrating negative signals from the symbiotic microbiota. Nature Immunol. 12,
320-326 (2011).

63. Sonnenberg, G. F. et al. Innate lymphoid cells promote anatomical containment
of lymphoid-resident commensal bacteria. Science 336, 1321-1325 (2012).

Refs 63-65 demonstrate a role for innate lymphoid cells in the local
containment of the microbiota and in regulating T-cell responses to the
microbiota.

64. Hepworth, M. R. et al. Innate lymphoid cells regulate CD4+ T-cell responses to
intestinal commensal bacteria. Nature 498, 113-117 (2013).

65. Hepworth, M. R. et al. Group 3 innate lymphoid cells mediate intestinal selection
of commensal bacteria-specific CD4+ T cells. Science 348, 1031-1035 (2015).

66. Mortha, A. et al. Microbiota-dependent crosstalk between macrophages and
ILC3 promotes intestinal homeostasis. Science 343, 1249288 (2014).

67. Kinnebrew, M. A. et al. Interleukin 23 production by intestinal CD103+CD11b+
dendritic cells in response to bacterial flagellin enhances mucosal innate
immune defense. Immunity 36, 276-287 (2012).

68. Kruglov, A. A. et al. Nonredundant function of soluble LTα3 produced by innate
lymphoid cells in intestinal homeostasis. Science 342, 1243-1246 (2013).

69. Goto, Y. et al. Innate lymphoid cells regulate intestinal epithelial cell
glycosylation. Science 345, 1254009 (2014).

70. Sonnenberg, G. F. & Artis, D. Innate lymphoid cell interactions with microbiota:
implications for intestinal health and disease. Immunity 37, 601-610 (2012).

71. von Moltke, J., Ji, M., Liang, H. E. & Locksley, R. M. Tuft-cell-derived IL-25
regulates an intestinal ILC2-epithelial response circuit. Nature 529, 221-225
(2016).

72. Powell, N. et al. The transcription factor T-bet regulates intestinal inflammation
mediated by interleukin-7 receptor+ innate lymphoid cells. Immunity 37,
674-684 (2012).

73. Levy, M., Thaiss, C. A. & Elinav, E. Metagenomic cross-talk: the regulatory
interplay between immunogenomics and the microbiome. Genome Med. 7,
120 (2015).

74. Petnicki-Ocwieja, T. et al. Nod2 is required for the regulation of commensal
microbiota in the intestine. Proc. Natl Acad. Sci. USA 106, 15813-15818
(2009).

75. Vijay-Kumar, M. et al. Metabolic syndrome and altered gut microbiota in mice
lacking Toll-like receptor 5. Science 328, 228-231 (2010).

76. Pickard, J. M. et al. Rapid fucosylation of intestinal epithelium sustains host-
commensal symbiosis in sickness. Nature 514, 638-641 (2014).

77. Kamdar, K. et al. Genetic and metabolic signals during acute enteric bacterial
infection alter the microbiota and drive progression to chronic inflammatory
disease. Cell Host Microbe 19, 21-31 (2016).

78. Ubeda, C. et al. Familial transmission rather than defective innate immunity
shapes the distinct intestinal microbiota of TLR-deficient mice. J. Exp. Med. 209,
1445-1456 (2012).

79. Hooper, L. V. et al. Molecular analysis of commensal host-microbial
relationships in the intestine. Science 291, 881-884 (2001).

This study provided initial insight into the effects of commensal bacteria on
genome-wide transcriptional reprogramming.

80. Kernbauer, E., Ding, Y. & Cadwell, K. An enteric virus can replace the beneficial
function of commensal bacteria. Nature 516, 94-98 (2014).

81. Rakoff-Nahoum, S. et al. Analysis of gene-environment interactions in postnatal
development of the mammalian intestine. Proc. Natl Acad. Sci. USA 112,
1929-1936 (2015).

82. Sommer, F., Nookaew, I., Sommer, N., Fogelstrand, P. & Bäckhed, F. Site-specific
programming of the host epithelial transcriptome by the gut microbiota.
Genome Biol. 16, 62 (2015).

83. Leone, V. et al. Effects of diurnal variation of gut microbes and high-fat feeding
on host circadian clock function and metabolism. Cell Host Microbe 17,
681-689 (2015).

84. Rawls, J. F., Mahowald, M. A., Ley, R. E. & Gordon, J. I. Reciprocal gut microbiota
transplants from zebrafish and mice to germ-free recipients reveal host habitat
selection. Cell 127, 423-433 (2006).

Chung, H. et a/. Gut immune maturation depends on colonization with a host- 
specific microbiota. Cel! 149, 1578-1593 (2012). 

Patrick, S. et al. A unique homologue of the eukaryotic protein-modifier 
ubiquitin present in the bacterium Bacteroides fragilis, a predominant resident 
of the human gastrointestinal tract. Microbiology 157, 3071-3078 (2011). 
Neish, A. S. et al. Prokaryotic regulation of epithelial responses by inhibition of 
IkB-a ubiquitination. Science 289, 1560-1563 (2000). 
Kumar, A. et al. Commensal bacteria modulate cullin-dependent signaling via 
generation of reactive oxygen species. EMBO J. 26, 4457-4466 (2007). 

Kelly, D. et al. Commensal anaerobic gut bacteria attenuate inflammation by 
regulating nuclear-cytoplasmic shuttling of PPAR-y and RelA. Nature Immunol. 
5, 104-112 (2004). 
Zhang, Q. et al. Commensal bacteria direct selective cargo sorting to promote 
symbiosis. Nature Immunol. 16, 918-926 (2015). 
Camp, J. G. et al. Microbiota modulate transcription in the intestinal epithelium 
without remodeling the accessible chromatin landscape. Genome Res. 24, 
1504-1516 (2014). 

Takahashi, K. et a/. Epigenetic control of the host gene by commensal bacteria 
in large intestinal epithelial cells. J. Biol. Chem. 286, 35755-35762 (2011). 
Olszak, T. et a/. Microbial exposure during early life has persistent effects on 
natural killer T cell function. Science 336, 489-493 (2012). 

Alenghat, T. et al. Histone deacetylase 3 coordinates commensal-bacteria- 
dependent intestinal homeostasis. Nature 504, 153-157 (2013). 

Chen, G. Y., Liu, M., Wang, F., Bertin, J. & Nunez, G. A functional role for Nirpé 


7 JULY 2016 | VOL 535 | NATURE | 73 


© 2016 Macmillan Publishers Limited. All rights reserved. 


REVIEW 


in intestinal inflammation and tumorigenesis. J. Immunol. 186, 7187-7194 
(2011). 

96. Jiang, W. et a/. Recognition of gut microbiota by NOD2 is essential for the 
homeostasis of intestinal intraepithelial lymphocytes. J. Exp. Med. 210, 
2465-2476 (2013). 

97. Kobayashi, K. S. et al. Nod2-dependent regulation of innate and adaptive 
immunity in the intestinal tract. Science 307, 731-734 (2005). 

98. Franchi, L. et a/. NLRC4-driven production of IL-1 discriminates between 
pathogenic and commensal bacteria and promotes host intestinal defense. 
Nature Immunol. 13, 449-456 (2012). 

99. Zheng, Y. et al. Interleukin-22 mediates early host defense against attaching 
and effacing bacterial pathogens. Nature Med. 14, 282-289 (2008). 

100. Diehl, G. E. et al. Microbiota restricts trafficking of bacteria to mesenteric lymph 
nodes by CX3CR1" cells. Nature 494, 116-120 (2013). 

101.Macpherson, A. J. & Uhr, T. Induction of protective IgA by intestinal dendritic 
cells carrying commensal bacteria. Science 303, 1662-1665 (2004). 

This seminal study defined the lymph-node-restricted ‘firewall’ circuits that 
control the local containment of the microbiota. 

102.Sano, T. et al. An IL-23R/IL-22 circuit regulates epithelial serum amyloid A to 
promote local effector Th17 responses. Cel! 163, 381-393 (2015). 

103.Naik, S. et al. Commensal-dendritic-cell interaction specifies a unique 
protective skin immune signature. Nature 520, 104-108 (2015). 

104.Balmer, M. L. et a/. The liver may act as a firewall mediating mutualism between 
the host and its gut commensal microbiota. Sci. Transl. Med. 6, 237ra66 (2014). 

105.Zeissig, S. & Blumberg, R. S. Life at the beginning: perturbation of the 
microbiota by antibiotics in early life and its role in health and disease. Nature 
Immunol. 15, 307-310 (2014). 

106.Henao-Mejia, J. et al. Inflammasome-mediated dysbiosis regulates progression 
of NAFLD and obesity. Nature 482, 179-185 (2012). 

107.Luo, Y. et al. Microbiota from obese mice regulate hematopoietic stem cell 
differentiation by altering the bone niche. Cel! Metab. 22, 886-894 (2015). 

108.Abt, M. C. et al. Commensal bacteria calibrate the activation threshold of innate 
antiviral immunity. Immunity 37, 158-170 (2012). 

109.Ichinohe, T. et a/. Microbiota regulates immune defense against respiratory tract 
influenza A virus infection. Proc. Natl Acad. Sci. USA 108, 5354-5359 (2011). 

110.Nice, T. J. et a/. Interferon-\ cures persistent murine norovirus infection in the 
absence of adaptive immunity. Science 347, 269-273 (2015). 

111.Zhang, B. et al. Prevention and cure of rotavirus infection via TLR5/NLRC4- 
mediated production of IL-22 and IL-18. Science 346, 861-865 (2014). 

112.Hernandez, P. P. et al. Interferon-A and interleukin 22 act synergistically for the 
induction of interferon-stimulated genes and control of rotavirus infection. 
Nature Immunol. 16, 698-707 (2015). 

113.Baldridge, M. T. et a. Commensal microbes and interferon-A determine 
persistence of enteric murine norovirus infection. Science 347, 266-269 
(2015). 

114. Kane, M. et a/. Successful transmission of a retrovirus depends on the 
commensal microbiota. Science 334, 245-249 (2011). 

115.Kuss, S. K. et al. Intestinal microbiota promote enteric virus replication and 
systemic pathogenesis. Science 334, 249-252 (2011). 

116.Guo, X. et al. Innate lymphoid cells control early colonization resistance against 
intestinal pathogens through ID2-dependent regulation of the microbiota. 
Immunity 42, 731-743 (2015). 

117.Kamada, N. et al. Regulated virulence controls the ability of a pathogen to 
compete with the gut microbiota. Science 336, 1325-1329 (2012). 

118.Behnsen, J. et al. The cytokine IL-22 promotes pathogen colonization by 
suppressing related commensal bacteria. Immunity 40, 262-273 (2014). 

119.Maekawa, T. et al. Porphyromonas gingivalis manipulates complement and TLR 
signaling to uncouple bacterial clearance from inflammation and promote 
dysbiosis. Cell Host Microbe 15, 768-778 (2014). 

120.Carvalho, F. A. et al. Transient inability to manage proteobacteria promotes 
chronic gut inflammation in TLR5-deficient mice. Cell Host Microbe 12, 
139-152 (2012). 

121.Rigottier-Gois, L. Dysbiosis in inflammatory bowel diseases: the oxygen 
hypothesis. /SME J. 7, 1256-1261 (2013). 

122.Hugot, J. P. et al. Association of NOD2 leucine-rich repeat variants with 
susceptibility to Crohn’s disease. Nature 411, 599-603 (2001). 

123.Ogura, Y. et al. A frameshift mutation in NOD2 associated with susceptibility to 
Crohn’s disease. Nature 411, 603-606 (2001). 

124.Hampe, J. et al. A genome-wide association scan of nonsynonymous SNPs 
identifies a susceptibility variant for Crohn disease in ATG16L1. Nature Genet. 
39, 207-211 (2007). 

125.Rioux, J. D. et al. Genome-wide association study identifies new susceptibility 
loci for Crohn disease and implicates autophagy in disease pathogenesis. 
Nature Genet. 39, 596-604 (2007). 

126. lliev, |. D. et a/. Interactions between commensal fungi and the C-type lectin 


74 | NATURE | VOL 535 | 7 JULY 2016 


receptor Dectin-1 influence colitis. Science 336, 1314-1317 (2012). 

127.Kostic, A. D. et al. The dynamics of the human infant gut microbiome in 
development and in progression toward type 1 diabetes. Cel! Host Microbe 17, 
260-273 (2015). 

128.Wen, L. et a/. Innate immunity and intestinal microbiota in the development of 
type 1 diabetes. Nature 455, 1109-1113 (2008). 

129.Scher, J. U. et al. Expansion of intestinal Prevotella copri correlates with 
enhanced susceptibility to arthritis. eLife 2, e€01202 (2013). 

130.Stefka, A. T. et al. Commensal bacteria protect against food allergen 
sensitization. Proc. Nat! Acad. Sci. USA 111, 13145-13150 (2014). 

131.Ji, Y. et al. Diet-induced alterations in gut microflora contribute to lethal 
pulmonary damage in TLR2/TLR4-deficient mice. Cel! Rep. 8, 137-149 (2014). 

132.Donath, M. Y. & Shoelson, S. E. Type 2 diabetes as an inflammatory disease. 
Nature Rev. Immunol. 11, 98-107 (2011). 

133.Turnbaugh, P. J. et al. An obesity-associated gut microbiome with increased 
capacity for energy harvest. Nature 444, 1027-1031 (2006). 
One of the first studies to link dysbiosis to disease. 

134.Wang, X. et al. Interleukin-22 alleviates metabolic disorders and restores 
mucosal immunity in diabetes. Nature 514, 237-241 (2014). 

135.Le Chatelier, E. et al. Richness of human gut microbiome correlates with 
metabolic markers. Nature 500, 541-546 (2013). 

136.Yang, T. et al. Gut dysbiosis is linked to hypertension. Hypertension 65, 
1331-1340 (2015). 

137.Rune, |. et al. Modulating the gut microbiota improves glucose tolerance, 
lipoprotein profile and atherosclerotic plaque development in ApoE-deficient 
mice. PLoS ONE 11, e0146439 (2016). 

138.Koren, O. et al. Human oral, gut, and plaque microbiota in patients with 
atherosclerosis. Proc. Natl Acad. Sci. USA 108 (suppl.), 4592-4598 (2011). 

139.Koeth, R. A. et al. Intestinal microbiota metabolism of L-carnitine, a nutrient in 
red meat, promotes atherosclerosis. Nature Med. 19, 576-585 (2013). 
Refs 139-141 explore the causative involvement of specific bacterial 
metabolites in metabolic disease. 

140.Wang, Z. et al. Gut flora metabolism of phosphatidylcholine promotes 
cardiovascular disease. Nature 472, 57-63 (2011). 

141.Wang, Z. et al. Non-lethal inhibition of gut microbial trimethylamine production 
for the treatment of atherosclerosis. Ce// 163, 1585-1595 (2015). 

142.Jin, C., Henao-Mejia, J. & Flavell, R. A. Innate immune receptors: key regulators 
of metabolic disease progression. Cell Metab. 17, 873-882 (2013). 

143.Irrazabal, T., Belcheva, A., Girardin, S. E., Martin, A. & Philpott, D. J. The 
multifaceted role of the intestinal microbiota in colon cancer. Mol. Cell 54, 
309-320 (2014). 

144. Grivennikov, S. |. et al. Adenoma-linked barrier defects and microbial products 
drive IL-23/IL-17-mediated tumour growth. Nature 491, 254-258 (2012). 
145.Sivan, A. et al. Commensal Bifidobacterium promotes antitumor immunity and 

facilitates anti-PD-L1 efficacy. Science 350, 1084-1089 (2015). 
146.Ahern, P. P,, Faith, J. J. & Gordon, J. |. Mining the human gut microbiota for 
effector strains that shape the immune system. Immunity 40, 815-823 (2014). 
147.Zmora, N., Zeevi, D., Korem, T., Segal, E. & Elinav, E. Taking it personally: 
personalized utilization of the human microbiome in health and disease. Cell 
Host Microbe 19, 12-20 (2016). 


Acknowledgements We apologize to those authors whose relevant work could
not be included owing to space constraints. We thank the members of the Elinav
laboratory for discussions. C.A.T. received a Boehringer Ingelheim Fonds PhD
fellowship. N.Z. is supported by the Gilead Sciences International Research
Scholars Program in Liver Disease. E.E. is supported by: Y. and R. Ungar; the Abisch
Frenkel Foundation for the Promotion of Life Sciences; the Gurwin Family Fund
for Scientific Research; the Leona M. and Harry B. Helmsley Charitable Trust; the
Crown Endowment Fund for Immunological Research; the estate of J. Gitlitz; the
estate of L. Hershkovich; the Benoziyo Endowment Fund for the Advancement of
Science; the Adelis Foundation; J. L. and V. Schwartz; A. and G. Markovitz; A. and C.
Adelson; the French National Center for Scientific Research (CNRS); D. L. Schwarz;
the V. R. Schwartz Research Fellow Chair; L. Steinberg; J. N. Halpern; A. Edelheit;
grants funded by the European Research Council; a Marie Curie Career Integration
Grant; the German-Israeli Foundation for Scientific Research and Development;
the Israel Science Foundation; the Minerva Foundation; the Rising Tide Foundation;
the Helmholtz Association; and the European Foundation for the Study of Diabetes.
E.E. is the incumbent of the Rina Gudinski Career Development Chair.


Author information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of this 
paper at go.nature.com/28j8r5I. Correspondence should be addressed to E.E. 
(eran.elinav@weizmann.ac.il). 




REVIEW 


doi:10.1038/nature18848 


The microbiota in adaptive immune 
homeostasis and disease 


Kenya Honda¹,²,³ & Dan R. Littman⁴,⁵


In the mucosa, the immune system’s T cells and B cells have position-specific phenotypes and functions that are influenced 
by the microbiota. These cells play pivotal parts in the maintenance of immune homeostasis by suppressing responses 
to harmless antigens and by enforcing the integrity of the barrier functions of the gut mucosa. Imbalances in the gut 
microbiota, known as dysbiosis, can trigger several immune disorders through the activity of T cells that are both near 
to and distant from the site of their induction. Elucidation of the mechanisms that distinguish between homeostatic and 
pathogenic microbiota–host interactions could identify therapeutic targets for preventing or modulating inflammatory
diseases and for boosting the efficacy of cancer immunotherapy. 


Microbiotas that establish mutualistic relationships with their mammalian hosts are able to
influence a multitude of physiological functions, often through modulation of the host’s
immune system. Certain bacteria that inhabit defined niches transmit distinct signals that
affect functions of both the innate and adaptive immune systems, which often results in
systemic outcomes that are distal to the site of colonization. For example, segmented
filamentous bacteria (SFB) induce T helper 17 (TH17) cells in the small intestine and can
trigger autoimmune arthritis in susceptible mice. Some species of Bifidobacterium can enhance
the T-cell-dependent anti-tumour effect of blocking the programmed death 1 (PD-1) pathway,
and regulatory T (Treg) cells that are induced by bacteria can have systemic
anti-inflammatory functions. There are only a handful of examples of single species or
defined communities of bacteria that can be used to provide insight into the mechanisms by
which distinct subsets of lymphocytes are activated and polarized. Efforts to culture and
characterize the commensal bacteria of humans and to assess their influence on the host’s
immune system, which typically involve the colonization of germ-free mice, promise to provide
new tools for investigating which cell types and signalling pathways are crucial for the
induction of distinct immune responses. The characterization of IgA-coated gut bacteria from
mice and humans, which provides a snapshot of the bacteria that are sensed by the cells of
the adaptive immune system, has also been a valuable advance. This approach has been used to
identify bacteria with potentially colitogenic functions in people with malnutrition and in
individuals with inflammatory bowel disease, as well as to compare species of bacteria that
elicit T-cell-dependent and T-cell-independent IgA-mediated responses in the host.

In this Review, we describe progress towards understanding how colonization of the
mammalian host by microbes influences the functional diversity and the repertoires of
B cells and T cells, with an emphasis on the differentiation of IgA-producing B cells and
T cells that carry the CD4 antigen, particularly TH17 cells and Treg cells that constitute
a large proportion of the effector T (Teff) cells of the lamina propria of the intestines.
The reciprocal roles of lymphocytes in regulating the microbiota, a topic that has so far
received little attention, will also be discussed briefly. It should be noted that insights
into the interactions of the microbiota with the immune cells of the host tend to come from
studies of mice in controlled environments, which have limited exposure to pathogenic
microbes or to the microbiota of wild populations. Housing of laboratory mice together with
free-living wild mice results in a constitutive increase in highly differentiated innate and
adaptive immune cells, including effector memory T cells that carry the CD8 antigen, in the
laboratory mice. The immune profile of these mice matches that of adult humans much more
closely than does that of mice kept in specific pathogen-free conditions. The failure of
some mouse studies to predict the responses of humans to therapy could therefore be partly
because of differences in the microbiotas of the species.


¹Department of Microbiology and Immunology, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan. ²RIKEN Center for Integrative Medical Sciences, Tsurumi, Yokohama, Kanagawa
230-0045, Japan. ³AMED-CREST, Chiyoda, Tokyo 100-0004, Japan. ⁴The Helen L. and Martin S. Kimmel Center for Biology and Medicine at the Skirball Institute of Biomolecular Medicine, New York
University School of Medicine, New York, New York 10016, USA. ⁵The Howard Hughes Medical Institute, New York University School of Medicine, New York, New York 10016, USA.


Interactions of the microbiota with B cells and T cells
Studies have suggested roles for diverse species of microbes in regulating the distinct
branches of the adaptive immune system. Antigen-specific adaptive immune responses
influence the mutualistic relationship between the microbiota and the host, and are mostly
directed at the microbes of the gut.


IgA

Mucosal IgA is secreted across the epithelium by binding to the polymeric immunoglobulin
receptor, after which it binds to microbes, various components of the diet and to antigens
in the lumen of the intestine. IgA coats and agglutinates its targets to prevent their
direct interaction with the host. This averts potentially harmful stimulation of the immune
system in mucous membranes by the contents of the lumen and it also serves to regulate the
composition of the microbiota. As well as providing a physical barrier, IgA can control the
expression of genes by microbes in the intestine. For example, in the absence of IgA, the
commensal bacterium Bacteroides thetaiotaomicron, which typically does not trigger
inflammation in the human gut, expresses high levels of gene products that are involved in
the metabolism of nitric oxide and elicits pro-inflammatory signals in the host. Similarly,
mice that are deficient for Toll-like receptor 5 (TLR5) show reduced levels of IgA that is
directed against the protein flagellin, which results in aberrant expression of
flagella-related genes by a wide range of commensal microbes. IgA that has undergone
affinity maturation through somatic hypermutation binds to and selects for particular
components of the microbiota, which leads to an increase in the diversity of the microbial
community and enhances mutualism between the microbiota and the host. Consistent with this
observation, people who are deficient in IgA have more bacteria from taxa with potentially
inflammatory properties. Moreover, mice that carry a mutation called AID G23S, which allows
the enzyme activation-induced cytidine deaminase to mediate normal IgA class switching but
without somatic hypermutation, harbour a dysbiotic microbiota in their small intestine.
Selection of affinity-matured, microbe-specific IgA is therefore crucial for the
establishment of a balanced microbiota that, in turn, can restrain inflammatory processes.

Gut plasma cells that produce IgA can be generated by both T-cell-dependent and
T-cell-independent mechanisms that involve the cooperation of epithelial cells, dendritic
cells and innate lymphoid cells (ILCs) (Fig. 1 and Box 1). In both pathways, the gut
microbiota affects the accumulation of cells that express IgA as well as the level and
diversity of IgA in the lumen. Indeed, IgA-expressing cells in lymphoid tissue known as
Peyer’s patches and in the lamina propria are greatly reduced in germ-free animals, and the
colonization of germ-free mice with a microbiota quickly triggers the production of IgA.
Interestingly, some members of the microbiota, such as species of Sutterella, are inversely
correlated with the level of IgA in faeces. These members degrade both IgA and a peptide
that is required for the stability of IgA in the lumen, known as secretory component.
Because microbiota-induced IgAs are directed towards microbial antigens, a substantial
proportion of the microbiota are coated with IgA and can be detected and characterized
through flow cytometry and 16S ribosomal RNA gene sequencing. Known as IgA-SEQ, this
combined approach has demonstrated that anatomical location determines whether a particular
species of bacterium will elicit an IgA-mediated response in the host. Bacteria that can
invade the inner mucous layer of the intestine and colonize regions in proximity to
epithelial cells induce high-affinity T-cell-dependent IgA responses. In particular, SFB and
Mucispirillum associate intimately with the intestinal epithelium, where they elicit a
T-cell-dependent IgA-mediated response and are heavily coated with IgA (Fig. 1). Because SFB
have a propensity to induce the production of TH17 cells, they might also induce follicular
helper (TFH) cells with a phenotype that is distinct from those of TFH cells that are
induced by other commensal bacteria, thereby resulting in a strong, TH17-cell-dependent
high-affinity IgA response. Mice that are deficient in T cells owing to a lack of T-cell
antigen receptor (TCR) chains β and δ, as well as those that lack TFH cells and the
T-cell-dependent IgA pathway owing to T-cell-specific inactivation of the gene Bcl6 in CD4+
T cells, retain an IgA-mediated response that is specific to antigens from commensal
bacteria, indicating that the T-cell-independent pathway is also directed at the
microbiota. However, this response is characterized largely by the low-affinity binding of
IgA to antigens that are shared by multiple species of bacteria.

[Figure 1: schematic with two panels, T-cell-dependent IgA induction (Peyer’s patch) and T-cell-independent IgA induction (isolated lymphoid follicle).]

Figure 1 | Induction of IgA in mucosal tissues. T-cell-dependent IgA
class-switch recombination (left) takes place mostly in Peyer’s patches, in
which dendritic cells that are located near to the surface of the epithelium
capture antigens from microbes that are transferred by M cells. Dendritic
cells induce the differentiation of CD4-expressing T cells into the T follicular
helper (TFH) cell subset. CD40 ligand (CD40L) and IL-21 from TFH cells
induce the expression of activation-induced cytidine deaminase (AID) in
B cells and promote IgA class-switch recombination. T-cell-independent
IgA class-switch recombination (right) occurs predominantly in the lamina
propria and isolated lymphoid follicles (ILFs), where B-cell activating factor
(BAFF; also known as TNFSF13B) and its homologue APRIL, which are
derived from dendritic cells, promote the induction of AID expression in
B cells. Transforming growth factor β (from dendritic cells and stromal cells)
and retinoic acid (from dietary vitamin A) play important parts (not shown)
in both T-cell-dependent and T-cell-independent pathways. ILC3s that
express RORγt also contribute to those pathways, through the expression
of lymphotoxin (LT)-α and LT-β, which activate dendritic cells. The gut
microbiota affects IgA class-switch recombination in both pathways. The
T-cell-independent pathway produces IgA with low affinity but directed
towards the microbiota. The T-cell-dependent pathway tends to be activated
by bacteria that colonize the surface of the epithelium, such as segmented
filamentous bacteria (SFB), Mucispirillum, Clostridium scindens and
Akkermansia muciniphila. The IgA-expressing B-cell clones that this pathway
induces persist for a long time and can re-enter a germinal centre, where they
undergo further somatic hypermutation to produce high-affinity IgA that is
adapted to the changing composition of the microbiota.

Induced clones of IgA-producing B cells persist for long periods, even after transient
exposure to microbes (Fig. 1). Accordingly, an increase in the complexity of the gut
microbiota leads to an increase in the diversity of the IgA pool. The repertoire of IgA in
the gut is dynamically adjusted in response to changes in the composition of the
microbiota. This process of adaptation relies mostly on the re-entry of B-cell clones into a
germinal centre and on further somatic hypermutation of B-cell clones that are already
established in the pool of plasma cells in the intestine. The types of gut microbes that are
targeted by IgA change in accordance with the diet of the host. For example, in mice
colonized with the gut microbiotas of undernourished children and fed a nutrient-poor diet,
members of the Enterobacteriaceae are heavily coated with IgA. By contrast, in mice that are
colonized by the same microbiotas but fed a nutritionally sufficient diet, IgA binds to taxa
other than Enterobacteriaceae, even though the load of Enterobacteriaceae is similar. The
transfer of Enterobacteriaceae-enriched consortia of IgA-coated microbes leads to a severe
enteropathy that is characterized by disruption of the epithelial barrier of the intestine
and by weight loss, which suggests that bacteria that are heavily coated with IgA are
colitogenic. Consistent with this idea, IgA-coated bacteria that are isolated from people
with inflammatory bowel disease promote dramatically exacerbated development of colitis
induced by dextran sulfate sodium. However, enteropathy that is induced by colitogenic
bacteria can be prevented by the administration of IgA-targeted species of bacteria from
healthy microbiotas, such as Akkermansia muciniphila and Clostridium scindens. Bacteria that
are targeted by IgA are therefore not always colitogenic; they can even be of benefit to the
host through contributions to enhancing the barrier function of the mucosa.
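
The IgA-SEQ approach described in this section pairs sorting of IgA-coated and uncoated bacteria with 16S rRNA gene sequencing of each fraction. As a rough illustration of how such paired counts might be summarized, the sketch below scores each taxon by the log ratio of its relative abundance in the IgA-bound fraction to that in the unbound fraction. This scoring rule, the function names, the pseudocount and the read counts are all assumptions made for illustration; none of them is taken from this Review.

```python
import math


def relative_abundance(counts):
    """Convert raw 16S read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("fraction contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def iga_enrichment(bound_counts, unbound_counts, pseudocount=1e-4):
    """Hypothetical score: log2 ratio of relative abundance, IgA-bound vs unbound.

    The pseudocount (an arbitrary illustrative value) avoids division by zero
    for taxa that are absent from one of the two sorted fractions.
    """
    bound = relative_abundance(bound_counts)
    unbound = relative_abundance(unbound_counts)
    taxa = set(bound) | set(unbound)
    return {
        taxon: math.log2(
            (bound.get(taxon, 0.0) + pseudocount)
            / (unbound.get(taxon, 0.0) + pseudocount)
        )
        for taxon in taxa
    }


# Invented read counts, for illustration only.
iga_bound = {"SFB": 400, "Mucispirillum": 300, "Bacteroides": 300}
iga_unbound = {"SFB": 50, "Mucispirillum": 50, "Bacteroides": 900}

scores = iga_enrichment(iga_bound, iga_unbound)
for taxon, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{taxon}: {score:+.2f}")
```

A positive score marks a taxon that is over-represented among IgA-coated cells. As the text notes, heavy coating alone does not indicate whether a taxon is colitogenic or beneficial.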


TH17 cells

The high-affinity secretory IgA response is proposed to depend largely on TH17 cells that
express RAR-related orphan receptor (ROR)γt. These cells are most abundant in the lamina
propria of the intestine, where they account for 30–40% of differentiated memory CD4+
T cells. The signature cytokines of TH17 cells, interleukin (IL)-17A, IL-17F and IL-22,
stimulate the production of antimicrobial proteins by intestinal epithelial cells as well as
the formation of tight junctions between these cells. They also mediate the transportation
of IgA and the recruitment of granulocytes. Consequently, TH17 cells have an indispensable
role in preventing infection by several species of extracellular pathogenic bacteria and
fungi. Indeed, genetic defects in the IL-17–IL-17 receptor axis and in RORγt in humans have
been linked to susceptibility to chronic mucocutaneous candidiasis, and a deficiency of both
Il17a and Il17f in mice results in opportunistic infection of mucocutaneous zones by
Staphylococcus aureus. However, TH17 cells can also have pathogenic features, particularly
following their stimulation with IL-23 and IL-1β. Pathogenic TH17 cells express the
pro-inflammatory cytokines interferon (IFN)-γ and granulocyte-macrophage colony-stimulating
factor (GM-CSF; also known as CSF2) and exacerbate autoimmune and inflammatory diseases.
IL-23 is required for the conversion of IL-17-expressing T cells into encephalitogenic and
colitogenic T cells that express both IL-17 and IFN-γ or only IFN-γ (known as ex-TH17 cells,
TH17.1 cells or TH1* cells) and for the onset of disease in mice that are subjected to
colitis and to experimental autoimmune encephalomyelitis. Although both homeostatic TH17
cells and pathogenic TH17 cells are dependent on RORγt in combination with other factors for
their differentiation, what distinguishes the TH17 cells that promote homeostatic defence of
the gut barrier from those that are involved in pathogenic inflammation is a major
unanswered question.

It is unclear whether constituents of the microbiota or other environmental factors direct
the differentiation of naive CD4+ T cells into homeostatic or pathogenic TH17 cells. In
experimental models, a multitude of environmental factors have been shown to affect the
activation status of intestinal TH17 cells. For example, a diet that is high in salt
enhances the number of T cells in the intestinal lamina propria that express IL-17A and CD4
and increases the risk of TH17-cell-dependent autoimmunity. These phenotypes are ascribed to
the salt-mediated induction of serine/threonine-protein kinase Sgk1 (SGK1), which
phosphorylates and deactivates forkhead box protein O1, thereby relieving the inhibition of
RORγt-mediated transcription of IL-17A and the IL-23 receptor and promoting the generation
of pathogenic TH17 cells.

Lipids in the diet have also been implicated in promoting the differentiation of both TH17
cells and Treg cells. Long-chain fatty acids such as lauric acid promote the differentiation
of TH17 cells and induce more severe experimental autoimmune encephalomyelitis, whereas the
short-chain fatty acid propionic acid protects animals from disease, in part


REVIEW 


BOX 1
ILC3s in adaptive immune 
homeostasis 


Signals from the microbiota create complex interactions between
epithelial cells, dendritic cells, macrophages and ILC3s. ILC3s
contribute to the differentiation of T cells and B cells. For example,
ILC3s express lymphotoxin (LT)-α and LT-β and activate dendritic
cells, thereby contributing to both T-cell-dependent and
T-cell-independent pathways of IgA class switching. ILC3s also facilitate
the induction of TH17 cells through the production of IL-22 and
other factors. Activation of ILC3s and induction of TH17 cells
have been observed in mice that are colonized by segmented
filamentous bacteria (SFB) and other bacteria, including Citrobacter
rodentium. Activation of ILC3s by these bacteria requires
the TLR-dependent activation of CX3CR1-expressing cells (derived
from monocytes) and their production of IL-23, IL-1β and tumour
necrosis factor ligand superfamily member 15 (TNFSF15), which
act through receptors on ILC3s. IL-22 from ILC3s then activates
epithelial cells to produce serum amyloid A and other factors that
are required for the induction of TH17 cells.

Latent infection of wild-type mice with murine norovirus, which
induces pathogenesis in the intestines of mice that lack the gene
Atg16l1 (ref. 132), leads to IL-22 production by ILC3s and the
induction of TH17 cells, while suppressing the expansion of group 2
innate lymphoid cells (ILC2s), thereby offsetting the deleterious effect
of treatment with antibiotics. Viral components of the intestinal
microbiota could therefore act with commensal bacteria to reinforce
the epithelial barrier through activation of ILC3s and induction of
TH17 cells.

ILC3s that are activated by the microbiota also promote
expansion of Treg cells. Gut microbiota induce the production of
IL-1β from macrophages in the lamina propria, and this cytokine
acts on neighbouring ILC3s to activate their production of CSF2
(ref. 106). CSF2 then acts on CD103-expressing dendritic cells
in the colon to enhance the activity of aldehyde dehydrogenase
(ALDH) and produce TGF-β and IL-10, which induce Treg cells.


through the induction of Treg cells. Endogenous fatty acids, which are
dependent on the enzyme acetyl-CoA carboxylase 1 for their synthesis,
contribute to the differentiation of TH17 cells and to the development of
autoimmune diseases. It has also been suggested that an intermediate
in cholesterol biosynthesis acts as an endogenous ligand for RORγt and
that enzymes such as CYP51A1 and SC4MOL (also known as MSMO1),
which form part of the cholesterol biosynthesis pathway, contribute to
TH17 cell differentiation. These enzymes are upregulated in pathogenic
TH17 cells on their culture with saturated fatty acids, such as palmitic acid,
or with IL-23 (ref. 44) (Fig. 2). In the absence of IL-23, non-pathogenic
TH17 cells express the protein CD5L, an inhibitor of fatty-acid synthase,
and these cells have elevated levels of polyunsaturated fatty acids at the
expense of saturated fatty acids. The mechanism for regulating genes
that are the targets of RORγt in the presence of the different types of
fatty acids remains unclear, although it is possible that CD5L restricts
cholesterol synthesis, which diminishes the endogenous source of RORγt
ligands and thus the potential for pathogenicity. Fatty acids that are
produced by the microbiota might similarly modulate the activity of RORγt
and therefore govern the balance between homeostatic and potentially
pathogenic programs of gene expression in TH17 cells.

The microbiota are the most prominent influence from the
environment on the differentiation of TH17 cells. In germ-free mice, TH17 cells
are scarce in the lamina propria of the intestines as well as in the skin
(Box 2). The number of TH17 cells in the intestines varies widely between


7 JULY 2016 | VOL 535 | NATURE | 77 


© 2016 Macmillan Publishers Limited. All rights reserved. 




BOX 2


The skin microbiota and 
adaptive immunity 


The microbiota influences the differentiation of adaptive immune
cells both in the skin and the gut. Staphylococcus epidermidis, a
commensal bacterium of the skin, potently induces TH17 cells as
well as T cells that express both IL-17A and the antigen CD8 (ref.
134). Both cross-presenting dendritic cells that are dependent on
basic leucine zipper transcription factor ATF-like 3 (Batf3) and cells
derived from monocytes are required to induce a response from
cells that express IL-17A and CD8 to S. epidermidis in the skin.
On infection with the cutaneous pathogenic protozoan Leishmania
major, local commensal bacteria are necessary to elicit protective
immunity (which manifests as inflammation and necrosis), and
monoassociation with S. epidermidis is sufficient to promote
this response. Importantly, TH17 cells in the skin are affected
by the skin microbiota independently of the gut microbiota,
which suggests that TH17 cells of the mucosa are regulated in a
compartmentalized manner by local commensal bacteria. The
production of IL-17A by T cells in the skin requires the expression
of IL-1R but not IL-23R, which is in contrast to the requirements of
TH17 cells in the intestines and is consistent with
compartment-specific mechanisms for T-cell regulation. Although immunological
cross-communication has been shown to occur between mucosal
tissues such as the intestine and the lung and the nasopharynx
and the uterus, there seems to be a compartment-specific
regulation of immunity in the skin. This might be because the skin
is faced with challenges from the environment that differ from those
faced by mucosal sites and therefore requires distinct pathways to
control its local immune responses.


animal facilities, even in genetically identical mice that have been reared
in specific pathogen-free conditions, and often reflects whether mice have
been colonized with SFB (Fig. 2). Such bacteria are potent modulators
of the immune-cell functions of the host: as well as inducing TH17 cells,
they stimulate IgA synthesis and fucosylation of the epithelium
through the activation of group 3 innate lymphoid cells (ILC3s). SFB
that are indigenous to mice and rats are genetically distinct host-specific
members of the gut microbiota. On their monocolonization of
germ-free mice or rats, populations of SFB can expand in the gut lumen of either
species; however, the bacteria bind to epithelial cells of the small intestine
and induce TH17 cells in a strictly host-specific manner. The physical
interaction of SFB with the gut epithelium is therefore probably essential
for TH17-cell differentiation. The causality of the relationship between
the adhesion of bacteria to the epithelium and the induction of TH17 cells
is further supported by analysis of TH17-cell induction by the intestinal
pathogenic bacteria Citrobacter rodentium and Escherichia coli O157:H7
(ref. 50). On monocolonization of mice, these species triggered TH17-cell
responses, whereas adhesion-defective mutants failed to do so. Moreover,
20 strains of bacteria that were isolated from the faeces of a person with
ulcerative colitis exhibit characteristics that enable their adhesion to
epithelial cells and induction of TH17 cells in the colons of mice.

Colonization with adherent SFB elicits a unique program of gene
expression that includes the upregulation of two isoforms of the
protein serum amyloid A in the epithelial cells of the small intestine. This
induction is largely restricted to the terminal ileum, the site at which SFB
attach to the epithelium. The genes that encode serum amyloid A are
also induced when SFB and epithelial cell lines are cultured together in
vitro, which suggests that their direct interaction initiates a signalling
pathway that results in gene expression. In parallel, SFB activate ILC3s to
produce IL-22 through the intermediary expression of IL-23 by myeloid
cells (Fig. 2). The expression of serum amyloid A in the epithelial cells of
the small intestine is dependent on the secretion of IL-22 from ILC3s, by
way of phosphorylation of signal transducer and activator of transcription
3 (Stat3) in epithelial cells. In vivo induction of serum amyloid A might
therefore require both adhesion of SFB to epithelial cells and activation
of the IL-22 receptor.

Polarization of TH17 cells that are specific to SFB occurs in the
mesenteric lymph nodes, in which RORγt is upregulated before T cells migrate
to the lamina propria. TH17-cell polarization is dependent on
monocyte-derived CX3CR1+ cells rather than classic dendritic cells, although a role
for dendritic cells that express CD103 and CD11b and are dependent
on Notch2 and IRF4 for their development has also been proposed.
Polarized T cells that express RORγt and CD4 are distributed broadly
throughout the intestine and are even found in the spleen, although most
IL-17A expression is confined to the ileum, where serum amyloid A seems
to act as an adjuvant and contributes to the induction of IL-17A (Fig. 2).

The mechanism through which serum amyloid A stimulates TH17 cells
has yet to be resolved. In a feed-forward process, myeloid cells including
those that carry CX3CR1 can respond to serum amyloid A by producing
cytokines that activate ILC3s, which promotes TH17-cell differentiation
(Box 1). Serum amyloid A might also stimulate T cells directly to enhance
RORγt function and upregulate IL-17A expression. Serum amyloid A is
a carrier of both high-density lipoprotein and retinol, and it can deliver
these immunomodulatory molecules to antigen-presenting cells and
T cells. The potential regulation of TH17-cell differentiation by lipids
suggests that serum amyloid A might function unconventionally to modulate
inflammatory functions in these cells. Together, these findings indicate
that the differentiation of TH17 cells directed by SFB is mediated through
a complex circuitry of interactions between epithelial cells, dendritic cells
and ILC3s to generate cells that are poised to acquire effector functions
in the appropriate microenvironment (Fig. 2). Because SFB have not yet
been definitively identified in the human intestine, whether this circuitry
applies more generally to microbiota-mediated TH17-cell induction in
humans requires further investigation.


Intestinal TH17 cells and autoimmunity
Most of the TH17 cells that are elicited by SFB have TCRs that specifically
bind to antigens that are expressed by adhesive forms of these bacteria.
Two major antigens have been identified as being responsible for this
induction. These antigens might be preferentially taken up by the cells
of the host when SFB adhere to epithelial cells. Colonization with these
bacteria, and the consequent induction of TH17 cells with TCRs that
are specific for SFB antigens, helps to protect the host from intestinal
pathogenic species such as C. rodentium. However, SFB-induced TH17
cells might promote pathogenesis in hosts that have a genetic
predisposition to autoimmune diseases. In the K/BxN mouse model of
autoimmune arthritis, colonization with commensal microbes is required for
the development of disease. Monocolonization with SFB enhances the
production of autoantibodies and accelerates the progression of disease
through the generation of TH17 cells, although a microbiota-induced
T cell-dependent process can also precipitate disease. Mice that
harbour SFB are more susceptible to experimental autoimmune
encephalomyelitis than are germ-free mice. By contrast, the presence of SFB
is strongly correlated with a diabetes-free state in non-obese diabetic
mice. The influence of such bacteria on the development of
autoimmune diseases is therefore dependent on context. The conditions that
determine whether intestinal TH17 cells play a beneficial or harmful part
in the host are not yet fully understood. Interestingly, germ-free mice that
are colonized with SFB show a striking genotype-specific difference in the
induction of TH17 cells. For instance, BALB/c mice have fewer TH17 cells
but a greater amount and diversity of IgA in their faeces than do C57BL/6
mice. Therefore, a combination of genetics and the composition of the
gut microbiota affects the status of the immune system and an individual's
susceptibility to disease.

In the K/BxN mouse model of autoimmune arthritis, self-reactive TH17
cells that express a transgenic TCR that is specific for a self antigen can




[Figure 2 graphic. Recoverable labels: lumen; mucus layer; epithelium-adhering bacterium (SFB); small intestine epithelial cell; antigens; serum amyloid A; IL-22; high salt; lamina propria; CX3CR1+ cell; ILC3; microbial antigen; mesenteric lymph node; naive T cell; RORγt+ T cell; IL-17A expression; distribution throughout the gut and systemic lymphoid tissues.]

Figure 2 | Microbiota-mediated induction of TH17 cells and
autoimmunity. Epithelium-adhering bacteria initiate the differentiation
of naive CD4+ T cells into RORγt-expressing T cells (TH17 polarized cells)
(red) in the mesenteric lymph node through as-yet-undefined
antigen-presenting cells (APCs). TH17 polarized cells then accumulate and further
differentiate into IL-17-expressing homeostatic TH17 cells (green) in the
lamina propria of the small intestine. These homeostatic TH17 cells then
stimulate epithelial cells to enhance the integrity of the intestinal mucosal
barrier. The adhesion of segmented filamentous bacteria (SFB) elicits a
unique program of gene expression in the epithelial cells, including the
upregulation of serum amyloid A. Serum amyloid A from epithelial cells
of the small intestine seems to function as a cytokine and it modulates
CX3CR1-expressing cells (that are derived from monocytes) to produce


migrate out of the intestine and into the spleen. Self-reactive but
gut-microbiota-activated TH17 cells might contribute to other autoimmune
disorders, including uveitis and encephalomyelitis. Such
T-cell-mediated autoimmune conditions could be caused by cross-reactivity between
microbial peptides and self antigens, a process known as molecular
mimicry (Fig. 2). This model is consistent with the fact that the genes of
the major histocompatibility complex (MHC) are the most important
genetic susceptibility loci for many autoimmune disorders. Alternatively,
microbiota-specific TH17 cells might mediate some kind of bystander
effect. This is because autoimmune disorders often affect more than one
organ, and the genes that encode the signalling molecules that act
downstream of TCRs are important determinants of genetic susceptibility to
various autoimmune disorders in humans, including rheumatoid
arthritis. The T-cell threshold model proposes that gut-microbiota-activated
TH17 cells might migrate into the draining lymph nodes of the target
organs and either lower the threshold of activation of autoreactive T cells
or have their own activation threshold lowered. Indeed, TH17 cells that
are specific to SFB and that are primed in gut-draining lymph nodes can
be found in other lymph nodes and in the spleen. When produced
aberrantly in some organs, molecules such as serum amyloid A might serve
an adjuvant function and contribute to the heightened activity of such
T cells (Fig. 2).

The potential for detrimental inflammation suggests that the responses 
of T cells and B cells to the gut microbiota must be tightly regulated. This 
is achieved through a number of mechanisms, including T-cell depletion 
and anergy. In this context, expression of MHC class II molecules by ILC3s 
has been found to restrain the expansion of TH17 cells. This could occur
through the presentation of antigens that are derived from commensal
bacteria to induce apoptosis of the antigen-specific T cells, although an
antigen-presenting function for ILC3s is yet to be demonstrated. Beyond 
this context, however, one of the most crucial mechanisms for restraining 


[Figure 2 graphic, continued. Recoverable labels: serum amyloid A; LCFAs; saturated fatty acids; IL-23, IL-1β; draining lymph node in target organ; homeostatic TH17 cell pool; self antigen; T-cell threshold model (T-cell activation threshold lowered).]


IL-23, which stimulates the production of IL-22 by ILC3s. As well as its
effects on CX3CR1-expressing cells, serum amyloid A can stimulate
RORγt-expressing T cells directly to upregulate the expression of IL-17A. Dendritic
cells that express the antigens CD11b and CD103 have also been implicated
in the expansion and maintenance of TH17 cells (not shown). TH17 cells
become pathogenic when they are stimulated with IL-23, IL-1β, higher
concentrations of salt, long-chain fatty acids (LCFAs) and saturated fatty
acids. Pathogenic TH17 cells can migrate to the draining lymph nodes of
target organs, where they contribute to autoimmune disease through
cross-reactivity between peptides from microbes and self antigens (the molecular
mimicry model). Alternatively, microbiota-specific TH17 cells migrate to the
lymph nodes and lower the threshold of activation of auto-reactive T cells
such as T cells (the T-cell threshold model).


[Figure 2 graphic, continued. Molecular mimicry model: cross-reactivity to self antigens and microbial peptides; autoimmune conditions.]


inflammation in the gut is the induction of CD4+ Treg cells that express
forkhead box protein P3 (Foxp3).


Induction of Treg cells by the microbiota

Treg cells that express both CD4 and Foxp3 can be found in every organ
of the body, and they comprise a high proportion of the T cells of the
lamina propria of the intestine. Intestinal Treg cells play an
important part in maintaining immune tolerance to dietary antigens and the
gut microbiota as well as in suppressing tissue damage inflicted by
immune responses against pathogenic bacteria such as C. rodentium that
are mediated by T cells. The intestine contains both thymus-derived Treg
(tTreg) cells and peripherally differentiated Treg (pTreg) cells; pTreg cells are
substantially enriched in the colon, mainly express RORγt and generally
lack the zinc-finger protein Helios and the receptor neuropilin 1 (Nrp1)
(refs 77-79) (Fig. 3). Because pTreg cells disappear under germ-free
conditions, they are probably induced by the microbiota. Consistent with
this, Treg cells that express RORγt show the restricted TCR repertoire of
cells that have proliferated in response to peripheral stimuli, but their TCR
sequences overlap with those of CD4+ T cells that lack Foxp3 (ref. 79).
Experiments to track the fate of immature T cells that express a transgenic
TCR cloned from colonic Treg cells demonstrate that the expansion and
differentiation of the transgenic T cells into Treg cells occurs in the colon
in the presence of cognate commensal bacteria and not in the thymus.
A considerable fraction of RORγt+ Treg cells express IL-10 (ref. 77), which
is also produced by many other types of cell, including type 1 regulatory
(Tr1) cells and myeloid cells, which have important roles in maintaining
homeostasis in the intestines. Treg-cell-derived IL-10 is essential for
suppression of the aberrant activation of myeloid cells, γδ T cells and
TH17 cells. Treg cells that express RORγt also express high levels of
cytotoxic T-lymphocyte protein 4 (CTLA-4) (ref. 77) and are more
effective than RORγt-negative Treg cells in restraining immune pathogenesis




[Figure 3 graphic. Recoverable labels: small intestine; colon; dietary fibre; B. fragilis; B. theta; B. caccae; Clostridia clusters XIVa, IV, XVIII; intestinal epithelial cell; lumen; lamina propria; GPR109a; HDACi; self antigens; Foxp3+ Treg-cell pool; pTreg; tTreg; IL-2; RA; IL-10; Helios, Nrp1, RORγt and GATA3 status of each subset.]


Figure 3 | Influence of the microbiota and diet on subsets of regulatory
T cells in the intestine. Foxp3-expressing Treg cells in the intestine can be
subdivided into at least three subsets on the basis of their expression of RORγt,
GATA3, Helios and Nrp1. Treg cells that express RORγt but not Nrp1 are
induced at peripheral sites by antigens derived from the microbiota. Known as
pTreg cells, they are the main producers of IL-10, which suppresses the aberrant
activation of myeloid cells, γδ T cells and TH17 cells. Dendritic cells produce
mediators of pTreg-cell differentiation, including TGF-β and retinoic acid
(RA). Short-chain fatty acids (SCFAs), which are produced from dietary fibre
by certain members of the microbiota, particularly species of Clostridia, also
contribute to the induction of pTreg cells. On binding to the G-protein-coupled
receptor (GPR) 109a on dendritic cells, short-chain fatty acids induce the
expression of aldehyde dehydrogenase (ALDH), which metabolizes vitamin A
into RA. SCFAs entering dendritic cells act as inhibitors of histone deacetylase
(HDACi) to suppress the expression of pro-inflammatory cytokines. They


in models of colitis. Conditional inactivation of RORγt using the
Cre-Lox recombination system in Foxp3+ intestinal T cells in mice results in
TH2-cell-mediated inflammation or in the expansion of TH17 cells. It
should be noted that some intestinal TH17 cells lose IL-17A expression
in the presence of SFB and a fraction of these ex-TH17 cells express Foxp3
(ref. 86). Foxp3+ Cre-Lox mice in which RORγt has been inactivated
might therefore reflect their RORγt deficiency in ex-TH17 cells as well as
microbiota-induced pTreg cells.

The intestine also contains a subpopulation of Treg cells that expresses
the transcription factor GATA3 (ref. 87) (Fig. 3). These cells are distinct
from RORγt+ Treg cells, and most express Nrp1 and Helios and are
unaffected by the absence of the gut microbiota, which suggests that they
mainly derive from tTreg cells. Treg cells that express GATA3 co-express
the IL-33 receptor ST2 (also known as IL1RL1) (ref. 88). IL-33, which
is produced by the epithelial cells of the intestine at high levels under
conditions of inflammation, works with IL-2 and the process of TCR
engagement to induce the expression of GATA3 in Treg cells. GATA3
upregulates the expression of Foxp3 and ST2 in a feed-forward process
that promotes the proliferation and maintenance of Treg cells. Treg cells
that express Foxp3 but that lack RORγt and Nrp1 constitute one-third of
the Treg-cell population and are uniquely abundant in the lamina propria
of the small intestine (Fig. 3). This subpopulation is unaffected by the
absence of the gut microbiota but disappears in germ-free mice that are
fed an antigen-free diet. Such cells therefore seem to be pTreg cells that are
induced by dietary antigens, and they constitute a subpopulation that can
be distinguished from microbiota-induced, RORγt+ pTreg cells and from




also directly act on naive T cells through GPR43 or the upregulation of Foxp3
expression through HDAC inhibition. IL-2 derived from T cells probably
helps to stabilize the differentiation of Treg cells. Several species of Bacteroides
contribute to the induction of pTreg cells that express RORγt but not Nrp1
through dendritic cells. A second pool of pTreg cells expresses neither RORγt
nor Nrp1; these Treg cells are induced by, and maintain immune tolerance
to, dietary antigens. It should be noted that induction of pTreg cells through
dietary antigens occurs largely in the small intestine, whereas the induction of
pTreg cells by the microbiota occurs largely in the colon. Treg cells that express
both GATA3 and Nrp1 are thought to be generated in the thymus and are
known as tTreg cells. GATA3+ Treg cells express ST2 (a component of the IL-33
receptor that is also known as IL1RL1). IL-33, which is probably released from
the epithelial cells of the intestine at steady state, is markedly upregulated
under conditions of inflammation. IL-33 acts with IL-2 (from T cells) to
induce the expression of GATA3 in Treg cells.


GATA3-expressing tTreg cells. Mice that lack this subpopulation exhibit
an increased susceptibility to food allergies. Certain pTreg-cell and
tTreg-cell subpopulations might have complementary and context-dependent
functions, such as immune regulation at steady state in response to
components of the microbiota (by RORγt+ Treg cells that lack Nrp1) and of the
diet (by RORγt-negative Treg cells that also lack Nrp1) and under
conditions of inflammation that is triggered by self antigens (by GATA3+ Treg
cells that express Nrp1).
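
The three intestinal Foxp3+ Treg-cell subsets described in the text can be summarized as a simple lookup on marker expression. The following Python sketch is illustrative only: the class `TregMarkers`, the function `classify_treg` and the subset labels are hypothetical names introduced here, and the Boolean rules are a simplification of populations that the text describes with qualifiers such as 'most' and 'mainly'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TregMarkers:
    """Marker expression of a Foxp3+ intestinal Treg cell (illustrative only)."""
    roryt: bool  # expresses RORγt
    gata3: bool  # expresses GATA3
    nrp1: bool   # expresses neuropilin 1


def classify_treg(cell: TregMarkers) -> str:
    """Map a marker profile to the subset named in the text.

    Assumption: markers are treated as strictly on/off, which real
    populations are not; profiles outside the three described
    subsets are reported as 'unclassified'.
    """
    if cell.roryt and not cell.nrp1:
        # RORγt+ Nrp1-: induced peripherally by microbiota-derived antigens
        return "microbiota-induced pTreg"
    if cell.gata3 and cell.nrp1:
        # GATA3+ Nrp1+: thought to be generated in the thymus
        return "tTreg"
    if not cell.roryt and not cell.nrp1:
        # RORγt- Nrp1-: induced peripherally by dietary antigens
        return "dietary-antigen-induced pTreg"
    return "unclassified"


if __name__ == "__main__":
    print(classify_treg(TregMarkers(roryt=True, gata3=False, nrp1=False)))
```

The function deliberately returns 'unclassified' for any combination that the text does not describe, rather than forcing every profile into one of the three subsets.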

The parts played by individual members or defined communities
of the gut microbiota in the accumulation and functional maturation
of Treg cells of the intestine are starting to be illuminated. For example,
strains that fall within clusters IV, XIVa and XVIII of Clostridia have a
strong capacity for inducing the accumulation of Treg cells in the colon
(Fig. 3). Oral administration to germ-free mice of a mixture of 46 strains
of Clostridia that were derived from the faeces of conventional mice
leads to the strong induction of Treg cells in the colon. Similarly, a
mixture of 17 strains of Clostridia that were isolated from a healthy person
strongly induces Treg cells in the colons of mice and rats. This mixture
preferentially enhances the accumulation of RORγt-expressing Treg cells
that lack Helios, rather than of GATA3-expressing Treg cells. Strains
of Clostridia can also facilitate the expression of IL-10 and CTLA-4 by
Treg cells, and mice with an abundance of strains of Clostridia in their
intestines exhibit resistance to experimental colitis. In mouse models of
graft-versus-host disease, the introduction of 17 strains of Treg-inducing
Clostridia reduces severity of the disease. These Clostridia also stimulate
ILC3s to produce IL-22, which helps to reinforce the epithelial barrier
and reduces the permeability of the intestine to dietary proteins. Mice
colonized by a microbiota that includes Clostridia therefore display a
suppressed response to food allergens. Clostridia-induced Treg cells support
the production of IgA in the intestine, which contributes to increased
diversity of the microbiota and, in particular, of Clostridia.

One species of Clostridia, Faecalibacterium prausnitzii, is
underrepresented in people with inflammatory bowel disease and it promotes
the accumulation of IL-10-expressing T cells that are positive for both
CD4 and CD8αα in the colon. A population of lymphocytes from the
intestinal epithelium that is positive for both such antigens could have a
similar immune regulatory role in the small intestine of the mouse. These
microbiota-dependent T cells differentiate in the periphery on loss of the
expression of the CD4-lineage transcription factor ThPOK and
upregulation of the CD8-lineage transcription factor Runx3. How these cells
function in preventing the differentiation of inflammatory cells in the
small intestine is yet to be determined.

A small consortium of microbes known as altered Schaedler flora,
which contains strains of Clostridia, is also capable of increasing the
number of Treg cells in the lamina propria of the mouse colon. The precise
mechanism through which Clostridia stimulate the induction of Treg cells
in the colon remains to be elucidated. One possible mechanism is the
cooperative production of short-chain fatty acids through fermentation
of dietary fibre (Fig. 3). For example, the collective genomes of the 17
strains of Treg-cell-inducing Clostridia contain numerous genes that are
predicted to be involved in the biosynthesis of short-chain fatty acids.
Short-chain fatty acids suppress the expression of pro-inflammatory
cytokines in dendritic cells through the inhibition of histone deacetylases
(HDACs) and through the activation of the G-protein-coupled receptor
(GPR) 109a (also known as HCAR2) (ref. 97). They can also stimulate the
proliferation of Treg cells directly by activating GPR43 (FFAR2) (ref. 38)
and the differentiation of naive CD4+ T cells into pTreg cells through
HDAC inhibition, which results in histone H3 acetylation at the
conserved non-coding sequence (CNS)1 element of the gene Foxp3 (ref. 39).
In vitro stimulation of Treg cells with short-chain fatty acids upregulates
the expression of GPR15 (ref. 38), which promotes the recruitment of
Treg cells to the colon, although this has not been demonstrated in vivo.

Programs for the induction of Treg cells can also be activated by
non-Clostridia members of the microbiota. Lactobacillus reuteri and
L. murinus have been shown to increase the proportion of Treg cells in mice.
Infection with Helicobacter hepaticus induces IL-10-producing Treg cells
that inhibit the development of colitis in an H. hepaticus
antigen-specific manner. Bacteroides fragilis boosts the production of IL-10 by
Treg cells of the colon, and this activity is mediated by polysaccharide
A from the bacterium's capsule. Outer-membrane vesicles containing
polysaccharide A that are released by B. fragilis might also be taken up
by dendritic cells of the intestine to stimulate their production of IL-10
through TLR-2 signalling. The IL-10 from these dendritic cells might
then induce Treg cells to also produce IL-10. Several other species of
Bacteroides, including B. caccae and B. thetaiotaomicron, also induce the
accumulation of Foxp3+ cells, particularly RORγt-expressing pTreg cells
in the colon. Collectively, there is considerable overlap between the
responses of Treg cells to Clostridia, Lactobacillus and Bacteroides, which
indicates that different pathways for the regulation of Treg cells converge
in the intestine. The induction and maintenance of Treg cells might be a
common and crucial mechanism for maintaining the homeostatic and
beneficial relationship between the microbiota and the host.

It has been suggested that tolerogenic dendritic cells that carry the
CD103 antigen contribute to the induction of Treg cells. CSF2 that
is produced in response to the microbiota by ILC3s might also act on
dendritic cells of the colon to promote the expansion of Treg cells (Box 1)
(Fig. 3). The ablation of MHC class II expression in conventional dendritic
cells, which include CD103+ dendritic cells, results in reduced induction
of pTreg cells and in spontaneous inflammation. The TCR repertoires
of pTreg cells and tTreg cells differ substantially. In one study, at least half
of the TCRs that were cloned from Treg cells of the colon and expressed
in a reporter hybridoma cell line responded to autoclaved contents of
the intestines of mice, and two TCR clones were stimulated by strains of
Parabacteroides distasonis or by an uncharacterized species of Clostridia.
Consistent with this, at least some of the Treg cells that were induced by the
mixture of 17 strains of human-derived Clostridia reacted to Clostridia
antigens. Whether there is a role for antigen specificity in
Treg-cell-mediated tolerance at mucosal surfaces, however, is an important question that
still needs to be answered.

Cells from the adaptive immune system that are primed at microbiota- 
sensing mucosa can take up residence in and protect other mucosal 
surfaces. For example, intranasal vaccination is particularly effective 
at eliciting protective memory-T-cell responses against Chlamydia 
trachomatis in the female reproductive tract. However, when
ultraviolet-inactivated C. trachomatis is delivered intramucosally, antigens
accumulate preferentially in CD103-expressing dendritic cells that lack
CD11b and induce antigen-specific Treg cells, and no protective immunity
is elicited. By contrast, immunization with ultraviolet-inactivated
C. trachomatis conjugated to adjuvant nanoparticles that target
CD103-lacking dendritic cells that express CD11b provides effective,
antigen-specific memory responses by mucosa-resident TH1 cells. The immune
responses that mucosal bacteria elicit therefore differ according to the 
route and context of antigen delivery. Elucidation of the mechanisms 
by which commensal microbes deliver antigens for presentation in an 
immunogenic versus a tolerogenic context might enable the development 
of effective mucosal vaccines. 


Implications for disease and therapeutics 

Members of the gut microbiota have distinct effects on homeostasis of 
the host’s adaptive immune system. Differences in the composition of 
the community therefore contribute to variability in immune responses 
and susceptibility to infection, autoimmune disorders, allergy and other 
immunological conditions. Understanding the development of the 
mucosal immune system and its dysregulation in relation to normal and 
dysbiotic microbiotas is important for the development of drugs, probiotic 
supplements, vaccines and cancer immunotherapies. 


A crucial window of time 

The microbiota is established in early life. Indeed, an absence of 
microbiota during this period of development leads to increases in the 
number of invariant natural killer T (iNKT) cells and in susceptibility 
to colitis and asthma in animal models. Early exposure to the gut 
microbiota suppresses the abundance of iNKT cells in the gut and lung, 
partly through the epigenetic suppression of the gene that encodes the 
chemokine CXCL16 (ref. 109). Colonization with commensal bacteria
during the neonatal period also results in the recruitment of Treg
cells to mucosal sites and the establishment of long-lasting tolerance
to the microbes. Treatment with antibiotics results in an increase
in susceptibility to asthma in perinatal, but not adult, mice through a
decrease in the accumulation of Treg cells in the colon and an enhanced
IgE response. In the absence of colonization by a microbiota at an
early age, B cells preferentially undergo isotype switching to IgE, rather
than IgA. The elevated concentration of IgE in the blood serum of
germ-free mice is accompanied by an increase in circulating basophils
and exaggerated basophil-mediated TH2-cell responses and allergic
inflammation. The induction of IgE is not suppressed by colonization
with a microbiota later in life or by early colonization with a microbiota
of limited complexity.

A cohort of children who had a high risk of developing atopy and
asthma were found to have microbiotal dysbiosis that is characterized
by a reduction in four specific genera of bacteria: Faecalibacterium,
Lachnospira, Veillonella and Rothia (collectively known as FLVR).
Colonization of mice with FLVR mitigated airway inflammation in a
model of allergic asthma, which raises the prospect that atopy or asthma
could be averted by early therapy to correct dysbiosis. There is a crucial
window of time in early life, therefore, during which exposure to
diverse microbiota is extremely important for the suppression of iNKT
cells and IgE-expressing cells, the induction and expansion of Treg cells
and the establishment of systemic tolerance to a large spectrum of
environmental antigens.


Dysbiosis and aberrant adaptive immunity 

Microbiotal dysbiosis can be caused by genetic predisposition, infections 
and changes in diet and nutritional status, as well as the use of antibiotics, 
agents that suppress gastric acid and anticancer drugs. Although there 
is convincing evidence to suggest that dysbiosis causes or promotes dis- 
ease, the underlying mechanisms are not fully understood. Several reports 
describe the association of particular species of bacteria with autoimmune 
and inflammatory conditions. In a mouse model, the administration
of a diet rich in milk fat induces a bloom of taurocholic-acid-consuming
Bilophila wadsworthia, which enhances the response of TH1 cells
and accelerates the onset of colitis. Adherent-invasive E. coli (AIEC)
is frequently observed in people with inflammatory bowel disease and
can induce an active response by TH17 cells in mice. Mutations in the
gene NOD2, found in subsets of people with inflammatory bowel disease,
are associated with shifts in the composition of the gut microbiota.
Nod2 deficiency in mice results in the expansion of the commensal
bacterium Bacteroides vulgatus, which is accompanied by an excessive IFN-γ
response from intraepithelial lymphocytes. Colonization of the intestinal
mucosa by bacteria from the mouth, such as Veillonellaceae and
Fusobacteriaceae, is one of the earliest events in children with new-onset
Crohn's disease. Similarly, there is an increased prevalence of Prevotella
copri in the faecal microbiota of people with new-onset rheumatoid
arthritis. However, the ability of these bacteria to trigger disease is yet
to be established. 

As well as the activation of TH cells in response to potentially pathogenic
bacteria, compromised barrier function of the epithelium and dysregulated
responses to the commensal microbiota are important features
of chronic inflammatory diseases that are associated with dysbiosis. For
instance, infection with HIV leads to chronic dysbiosis with a reduction
in Clostridia and Bacteroidia and an enrichment of taxa that produce
enzymes for tryptophan catabolism, and is accompanied by heightened
permeability of the mucosa, elevated levels of T-cell activation and
diminished numbers of IL-17-secreting mucosal T cells. These events might
contribute collectively to the chronic inflammation that is observed in
individuals who are infected with HIV. In a mouse model, infection with
the protozoan Toxoplasma gondii and subsequent disruption of the epithelial
barrier induces memory TH1 cells specific for commensal Clostridia
that normally induce Treg cells and IgA-secreting B cells. Similarly,
responses to flagellin antigens (known as CBir) that are expressed by
commensal species from the Clostridia cluster XIVa have been detected
in people with Crohn's disease. Importantly, the transfer of CD4-expressing
T cells that are specific for CBir into immunodeficient mice
that have been colonized with commensal Clostridia causes severe
colitis. Disruption of the epithelial barrier owing to the complex interplay
between a dysbiotic microbiota and pathogenic bacteria might therefore 
lead to dysregulated immune responses to commensal microbes, chronic 
inflammation and the stabilization of a pro-inflammatory community 
of microbes. 


Cancer immunotherapy 

The importance of the composition of the microbiota in how tumour-car- 
rying hosts respond to chemotherapy or checkpoint blockade immuno- 
therapy has been highlighted in several studies. Reductions in the growth 
of sarcomas in mice following treatment with the chemotherapeutic drug 
cyclophosphamide can be compromised after exposure to antibiotics, and 
this has been attributed to the loss of anti-tumour TH17-cell-inducing
commensal bacteria, the growth of which is favoured by the chemotherapy.
However, it is unknown whether the beneficial anti-tumour
properties of microbiota-dependent TH17 cells are broadly applicable.
Similarly, antibiotics compromise the anti-tumour response that follows
CTLA-4 blockade in mice. In this case, anti-CTLA-4 immunotherapy
favours the dominance of species of Bacteroides, such as B. fragilis and
B. thetaiotaomicron, in both mice and people. These bacteria are of benefit
because they enhance the efficacy of the CTLA-4 blockade, possibly
through an anti-tumour response mediated by TH1 cells. In another
mouse model, colonization of the gut with Bifidobacteria has been found
to contribute to the control of implanted syngeneic tumours by
CD8-expressing T cells following anti-PD-L1 cancer immunotherapy. The
mechanism for the improved anti-tumour response might involve activation
of the functions of antigen-presenting cells, followed by improved
infiltration of tumours by cytotoxic T cells, although it remains to be
determined whether microbiota-regulated CD4+ T cells also have a role
in restraining the growth of tumours. 


Outlook 

Studies of how the mutualistic relationship between cells of the adaptive
immune system and members of the microbiota affects health and disease
are in their infancy. Most efforts have strived to establish reductionist
approaches that can be exploited to elucidate cellular and molecular
mechanisms. From a translational perspective, models of humanized
microbiota in germ-free mice and pigs have been established. It is possible
that these efforts will permit the design of bacterial consortia and
metabolic products that durably activate or suppress specific programs
of adaptive immunity, which will result in the development of improved
vaccines and therapeutic drugs for disorders that involve the immune
system, including infections, autoimmunity, allergies and cancer. It
should be noted, however, that the interactions between the microbiota
and the host are influenced to a large extent by host genetics, cooperation
and competition between pathogenic and commensal microbes
and multiple environmental variables, including diet, circadian factors
and the climate. The ‘one microbe, one response’ approach will probably
need to be supplanted by more integrative systems analyses that require
the development of advanced technologies and computational tools.
Improved characterization of metabolites or other microbial effectors,
coupled with computational pathway analyses, might enable the design
of synthetic organisms or postbiotic products that can shape immune
responses. Elucidation of the role of viruses and phages might provide
further approaches for targeting components of the microbiota or host
cells for therapeutic purposes. The role of the microbiota in shaping adaptive
immunity should therefore become an increasingly fertile area for
basic and translational investigation.


Received 21 February; accepted 25 April 2016. 


1. Ivanov, I. I. et al. Induction of intestinal Th17 cells by segmented filamentous
bacteria. Cell 139, 485-498 (2009).
Together with ref. 50, this study shows that a subset of the microbiota
specifically affects the accumulation of TH17 cells in the intestine.
2. Wu, H.-J. et al. Gut-residing segmented filamentous bacteria drive autoimmune
arthritis via T helper 17 cells. Immunity 32, 815-827 (2010).
3. Sivan, A. et al. Commensal Bifidobacterium promotes antitumor immunity and
facilitates anti-PD-L1 efficacy. Science 350, 1084-1089 (2015).
Together with refs 125 and 126, this study shows that a subset of the
microbiota can have an effect on the efficacy of cancer therapy.
4. Atarashi, K. et al. Treg induction by a rationally selected mixture of Clostridia
strains from the human microbiota. Nature 500, 232-236 (2013).
This study and ref. 5 show that a rationally selected consortium of bacteria can
specifically induce Treg cells in the intestine that function in systemic immune
regulation.
5. Atarashi, K. et al. Induction of colonic regulatory T cells by indigenous Clostridium
species. Science 331, 337-341 (2011).
6. Kau, A. L. et al. Functional characterization of IgA-targeted bacterial taxa from
undernourished Malawian children that produce diet-dependent enteropathy.
Sci. Transl. Med. 7, 276ra24 (2015).
Together with refs 7 and 8, this study shows that IgA-SEQ is a powerful
technique for identifying taxa that provide a strong stimulus to the host’s
immune system.
7. Palm, N. W. et al. Immunoglobulin A coating identifies colitogenic bacteria in
inflammatory bowel disease. Cell 158, 1000-1010 (2014).
8. Bunker, J. J. et al. Innate and adaptive humoral responses coat distinct
commensal bacteria with immunoglobulin A. Immunity 43, 541-553 (2015).
9. Beura, L. K. et al. Normalizing the environment recapitulates adult human
immune traits in laboratory mice. Nature 532, 512-516 (2016).
10. Roche, A. M., Richard, A. L., Rahkola, J. T., Janoff, E. N. & Weiser, J. N. Antibody
blocks acquisition of bacterial colonization through agglutination. Mucosal
Immunol. 8, 176-185 (2015).
11. Pabst, O. New concepts in the generation and functions of IgA. Nature Rev.
Immunol. 12, 821-832 (2012).




12. Peterson, D. A., McNulty, N. P., Guruge, J. L. & Gordon, J. I. IgA response to
symbiotic bacteria as a mediator of gut homeostasis. Cell Host Microbe 2,
328-339 (2007).

13. Cullender, T. C. et al. Innate and adaptive immunity interact to quench
microbiome flagellar motility in the gut. Cell Host Microbe 14, 571-581 (2013).

14. Kawamoto, S. et al. Foxp3+ T cells regulate immunoglobulin A selection
and facilitate diversification of bacterial species responsible for immune
homeostasis. Immunity 41, 152-165 (2014).

15. Friman, V., Nowrouzian, F., Adlerberth, I. & Wold, A. E. Increased frequency of
intestinal Escherichia coli carrying genes for S fimbriae and haemolysin in
IgA-deficient individuals. Microb. Pathog. 32, 35-42 (2002).

16. Wei, M. et al. Mice carrying a knock-in mutation of Aicda resulting in a defect
in somatic hypermutation have impaired gut homeostasis and compromised
mucosal defense. Nature Immunol. 12, 264-270 (2011).

17. Moon, C. et al. Vertically transmitted faecal IgA levels determine
extra-chromosomal phenotypic variation. Nature 521, 90-93 (2015).

18. Kubinak, J. L. et al. MyD88 signaling in T cells directs IgA-mediated control of the
microbiota to promote health. Cell Host Microbe 17, 153-163 (2015).

19. Hirota, K. et al. Plasticity of TH17 cells in Peyer’s patches is responsible for the
induction of T cell-dependent IgA responses. Nature Immunol. 14, 372-379
(2013).

20. Hapfelmeier, S. et al. Reversible microbial colonization of germ-free mice reveals
the dynamics of IgA immune responses. Science 328, 1705-1709 (2010).

21. Lindner, C. et al. Diversification of memory B cells drives the continuous
adaptation of secretory antibodies to gut microbiota. Nature Immunol. 16,
880-888 (2015).

22. Ivanov, I. I. et al. The orphan nuclear receptor RORγt directs the differentiation
program of proinflammatory IL-17+ T helper cells. Cell 126, 1121-1133 (2006).

23. Ivanov, I. I. et al. Specific microbiota direct the differentiation of IL-17-producing
T-helper cells in the mucosa of the small intestine. Cell Host Microbe 4, 337-349
(2008).

24. Atarashi, K. et al. ATP drives lamina propria TH17 cell differentiation. Nature 455,
808-812 (2008).

25. Weaver, C. T., Elson, C. O., Fouser, L. A. & Kolls, J. K. The Th17 pathway and
inflammatory diseases of the intestines, lungs, and skin. Annu. Rev. Pathol. 8,
477-512 (2013).

26. Puel, A. et al. Chronic mucocutaneous candidiasis in humans with inborn errors
of interleukin-17 immunity. Science 332, 65-68 (2011).

27. Okada, S. et al. Impairment of immunity to Candida and Mycobacterium in
humans with bi-allelic RORC mutations. Science 349, 606-613 (2015).

28. Ishigame, H. et al. Differential roles of interleukin-17A and -17F in host defense
against mucoepithelial bacterial infection and allergic responses. Immunity 30,
108-119 (2009).

29. McGeachy, M. J. et al. The interleukin 23 receptor is essential for the terminal
differentiation of interleukin 17-producing effector T helper cells in vivo. Nature
Immunol. 10, 314-324 (2009).

30. Coccia, M. et al. IL-1β mediates chronic intestinal inflammation by promoting
the accumulation of IL-17A secreting innate lymphoid cells and CD4+ Th17
cells. J. Exp. Med. 209, 1595-1609 (2012).

31. Hirota, K. et al. Fate mapping of IL-17-producing T cells in inflammatory
responses. Nature Immunol. 12, 255-263 (2011).

32. El-Behi, M. et al. The encephalitogenicity of TH17 cells is dependent on IL-1-
and IL-23-induced production of the cytokine GM-CSF. Nature Immunol. 12,
568-575 (2011).

33. Harbour, S. N., Maynard, C. L., Zindl, C. L., Schoeb, T. R. & Weaver, C. T. Th17 cells
give rise to Th1 cells that are required for the pathogenesis of colitis. Proc. Natl
Acad. Sci. USA 112, 7061-7066 (2015).

34. Ahern, P. P. et al. Interleukin-23 drives intestinal inflammation through direct
activity on T cells. Immunity 33, 279-288 (2010).

35. Jain, R. et al. Interleukin-23-induced transcription factor Blimp-1 promotes
pathogenicity of T helper 17 cells. Immunity 44, 131-142 (2016).

36. Wu, C. et al. Induction of pathogenic TH17 cells by inducible salt-sensing kinase
SGK1. Nature 496, 513-517 (2013).

37. Kleinewietfeld, M. et al. Sodium chloride drives autoimmune disease by the
induction of pathogenic TH17 cells. Nature 496, 518-522 (2013).

38. Smith, P. M. et al. The microbial metabolites, short-chain fatty acids, regulate
colonic Treg cell homeostasis. Science 341, 569-573 (2013).
Together with refs 39-41, this study identified short-chain fatty acids as strong
inducers of Treg cells in the colon.

39. Furusawa, Y. et al. Commensal microbe-derived butyrate induces the
differentiation of colonic regulatory T cells. Nature 504, 446-450 (2013).

40. Arpaia, N. et al. Metabolites produced by commensal bacteria promote
peripheral regulatory T-cell generation. Nature 504, 451-455 (2013).

41. Haghikia, A. et al. Dietary fatty acids directly impact central nervous system
autoimmunity via the small intestine. Immunity 43, 817-829 (2015).

42. Berod, L. et al. De novo fatty acid synthesis controls the fate between regulatory
T and T helper 17 cells. Nature Med. 20, 1327-1333 (2014).

43. Santori, F. R. et al. Identification of natural RORγ ligands that regulate the
development of lymphoid cells. Cell Metab. 21, 286-297 (2015).

44. Wang, C. et al. CD5L/AIM regulates lipid biosynthesis and restrains Th17 cell
pathogenicity. Cell 163, 1413-1427 (2015).

45. Naik, S. et al. Compartmentalized control of skin immunity by resident
commensals. Science 337, 1115-1119 (2012).

46. Umesaki, Y., Setoyama, H., Matsumoto, S., Imaoka, A. & Itoh, K. Differential
roles of segmented filamentous bacteria and clostridia in development of the
intestinal immune system. Infect. Immun. 67, 3504-3511 (1999).

47. Lécuyer, E. et al. Segmented filamentous bacterium uses secondary and tertiary
lymphoid tissues to induce gut IgA and specific T helper 17 cell responses.
Immunity 40, 608-620 (2014).

48. Goto, Y. et al. Innate lymphoid cells regulate intestinal epithelial cell
glycosylation. Science 345, 1254009 (2014).

49. Prakash, T. et al. Complete genome sequences of rat and mouse segmented
filamentous bacteria, a potent inducer of Th17 cell differentiation. Cell Host
Microbe 10, 273-284 (2011).

50. Atarashi, K. et al. Th17 cell induction by adhesion of microbes to intestinal
epithelial cells. Cell 163, 367-380 (2015).
Together with ref. 51, this study shows that the response of intestinal TH17
cells is directed towards commensal and pathogenic bacteria that activate
epithelial cells.

51. Sano, T. et al. An IL-23R/IL-22 circuit regulates epithelial serum amyloid A to
promote local effector Th17 responses. Cell 163, 381-393 (2015); erratum
164, 324 (2016).

52. Schnupf, P. et al. Growth and host interaction of mouse segmented filamentous
bacteria in vitro. Nature 520, 99-103 (2015).

53. Panea, C. et al. Intestinal monocyte-derived macrophages control
commensal-specific Th17 responses. Cell Rep. 12, 1314-1324 (2015).

54. Lewis, K. L. et al. Notch2 receptor signaling controls functional differentiation of
dendritic cells in the spleen and intestine. Immunity 35, 780-791 (2011).

55. Persson, E. K. et al. IRF4 transcription-factor-dependent CD103+CD11b+
dendritic cells drive mucosal T helper 17 cell differentiation. Immunity 38,
958-969 (2013).

56. Schlitzer, A. et al. IRF4 transcription factor-dependent CD11b+ dendritic cells
in human and mouse control mucosal IL-17 cytokine responses. Immunity 38,
970-983 (2013).

57. Derebe, M. G. et al. Serum amyloid A is a retinol binding protein that transports
retinol during bacterial infection. eLife 3, e03206 (2014).

58. Sczesnak, A. et al. The genome of Th17 cell-inducing segmented filamentous
bacteria reveals extensive auxotrophy and adaptations to the intestinal
environment. Cell Host Microbe 10, 260-272 (2011).

59. Yang, Y. et al. Focused specificity of intestinal TH17 cells towards commensal
bacterial antigens. Nature 510, 152-156 (2014).
This study and ref. 122 show that different constituents of the microbiota
guide distinct pathways of T-cell differentiation that is specific for the antigens
of commensal bacteria.

60. Block, K. E., Zheng, Z., Dent, A. L., Kee, B. L. & Huang, H. Gut microbiota regulates
K/BXN autoimmune arthritis through follicular helper T but not Th17 cells.
J. Immunol. 196, 1550-1557 (2016).

61. Lee, Y. K., Menezes, J. S., Umesaki, Y. & Mazmanian, S. K. Proinflammatory
T-cell responses to gut microbiota promote experimental autoimmune
encephalomyelitis. Proc. Natl Acad. Sci. USA 108 (suppl. 1), 4615-4622 (2011).

62. Kriegel, M. A. et al. Naturally transmitted segmented filamentous bacteria
segregate with diabetes protection in nonobese diabetic mice. Proc. Natl Acad.
Sci. USA 108, 11548-11553 (2011).

63. Fransen, F. et al. BALB/c and C57BL/6 mice differ in polyreactive IgA
abundance, which impacts the generation of antigen-specific IgA and
microbiota diversity. Immunity 43, 527-540 (2015).

64. Morton, A. M. et al. Endoscopic photoconversion reveals unexpectedly broad
leukocyte trafficking to and from the gut. Proc. Natl Acad. Sci. USA 111,
6696-6701 (2014).

65. Horai, R. et al. Microbiota-dependent activation of an autoreactive T cell receptor
provokes autoimmunity in an immunologically privileged site. Immunity 43,
343-353 (2015).

66. Berer, K. et al. Commensal microbiota and myelin autoantigen cooperate to
trigger autoimmune demyelination. Nature 479, 538-541 (2011).

67. Harkiolaki, M. et al. T cell-mediated autoimmune disease due to low-affinity
crossreactivity to common microbial peptides. Immunity 30, 348-357 (2009).

68. Sakaguchi, N. et al. Altered thymic T-cell selection due to a mutation of the
ZAP-70 gene causes autoimmune arthritis in mice. Nature 426, 454-460 (2003).

69. Hepworth, M. R. et al. Group 3 innate lymphoid cells mediate intestinal selection
of commensal bacteria-specific CD4+ T cells. Science 348, 1031-1035 (2015).

70. Round, J. L. & Mazmanian, S. K. Inducible Foxp3+ regulatory T-cell development
by a commensal bacterium of the intestinal microbiota. Proc. Natl Acad. Sci. USA
107, 12204-12209 (2010).

71. Geuking, M. B. et al. Intestinal bacterial colonization induces mutualistic
regulatory T cell responses. Immunity 34, 794-806 (2011).

72. Weiss, J. M. et al. Neuropilin 1 is expressed on thymus-derived natural regulatory
T cells, but not mucosa-generated induced Foxp3+ T reg cells. J. Exp. Med. 209,
1723-1742 (2012).

73. Stefka, A. T. et al. Commensal bacteria protect against food allergen sensitization.
Proc. Natl Acad. Sci. USA 111, 13145-13150 (2014).

74. Bilate, A. M. & Lafaille, J. J. Induced CD4+Foxp3+ regulatory T cells in immune
tolerance. Annu. Rev. Immunol. 30, 733-758 (2012).

75. Josefowicz, S. Z., Lu, L. F. & Rudensky, A. Y. Regulatory T cells: mechanisms of
differentiation and function. Annu. Rev. Immunol. 30, 531-564 (2012).

76. Kim, S. V. et al. GPR15-mediated homing controls immune homeostasis in the
large intestine mucosa. Science 340, 1456-1459 (2013).

77. Ohnmacht, C. et al. The microbiota regulates type 2 immunity through RORγt+ T
cells. Science 349, 989-993 (2015).
Together with refs 78 and 79, this study shows that a subset of Treg cells in
the intestine express RORγt and that their development is affected by the
microbiota.

78. Sefik, E. et al. Individual intestinal symbionts induce a distinct population of
RORγ+ regulatory T cells. Science 349, 993-997 (2015).

79. Yang, B. H. et al. Foxp3+ T cells expressing RORγt represent a stable regulatory
T-cell effector lineage with enhanced suppressive capacity during intestinal
inflammation. Mucosal Immunol. 9, 444-457 (2016).

80. Lathrop, S. K. et al. Peripheral education of the immune system by colonic
commensal microbiota. Nature 478, 250-254 (2011).

81. Roers, A. et al. T cell-specific inactivation of the interleukin 10 gene in
mice results in enhanced T cell responses but normal innate responses to
lipopolysaccharide or skin irritation. J. Exp. Med. 200, 1289-1297 (2004).

82. Krause, P. et al. IL-10-producing intestinal macrophages prevent excessive
antibacterial innate immunity by limiting IL-23 synthesis. Nature Commun. 6,
7055 (2015).

83. Rubtsov, Y. P. et al. Regulatory T cell-derived interleukin-10 limits inflammation
at environmental interfaces. Immunity 28, 546-558 (2008).

84. Huber, S. et al. Th17 cells express interleukin-10 receptor and are controlled
by Foxp3− and Foxp3+ regulatory CD4+ T cells in an interleukin-10-dependent
manner. Immunity 34, 554-565 (2011).

85. Park, S. G. et al. T regulatory cells maintain intestinal homeostasis by
suppressing γδ T cells. Immunity 33, 791-803 (2010).

86. Gagliani, N. et al. TH17 cells transdifferentiate into regulatory T cells during
resolution of inflammation. Nature 523, 221-225 (2015).

87. Wohlfert, E. A. et al. GATA3 controls Foxp3+ regulatory T cell fate during
inflammation in mice. J. Clin. Invest. 121, 4503-4515 (2011).

88. Schiering, C. et al. The alarmin IL-33 promotes regulatory T-cell function in the
intestine. Nature 513, 564-568 (2014).

89. Kim, K. S. et al. Dietary antigens limit mucosal immunity by inducing regulatory
T cells in the small intestine. Science 351, 858-863 (2016).

90. Itoh, K. & Mitsuoka, T. Characterization of Clostridia isolated from faeces of
limited flora mice and their effect on caecal size when associated with germ-free
mice. Lab. Anim. 19, 111-118 (1985).

91. Mathewson, N. D. et al. Gut microbiome-derived metabolites modulate intestinal
epithelial cell damage and mitigate graft-versus-host disease. Nature Immunol.
17, 505-513 (2016).

92. Sokol, H. et al. Faecalibacterium prausnitzii is an anti-inflammatory commensal
bacterium identified by gut microbiota analysis of Crohn disease patients. Proc.
Natl Acad. Sci. USA 105, 16731-16736 (2008).

93. Sarrabayrouse, G. et al. CD4CD8αα lymphocytes, a novel human regulatory
T cell subset induced by colonic bacteria and deficient in patients with
inflammatory bowel disease. PLoS Biol. 12, e1001833 (2014).

94. Reis, B. S., Rogoz, A., Costa-Pinto, F. A., Taniuchi, I. & Mucida, D. Mutual
expression of the transcription factors Runx3 and ThPOK regulates intestinal
CD4+ T cell immunity. Nature Immunol. 14, 271-280 (2013).

95. Mucida, D. et al. Transcriptional reprogramming of mature CD4+ helper T cells
generates distinct MHC class II-restricted cytotoxic T lymphocytes. Nature
Immunol. 14, 281-289 (2013).

96. Narushima, S. et al. Characterization of the 17 strains of regulatory T cell-
inducing human-derived Clostridia. Gut Microbes 5, 333-339 (2014).

97. Singh, N. et al. Activation of Gpr109a, receptor for niacin and the commensal
metabolite butyrate, suppresses colonic inflammation and carcinogenesis.
Immunity 40, 128-139 (2014).

98. Di Giacinto, C., Marinaro, M., Sanchez, M., Strober, W. & Boirivant, M. Probiotics
ameliorate recurrent Th1-mediated murine colitis by inducing IL-10 and IL-10-
dependent TGF-β-bearing regulatory cells. J. Immunol. 174, 3237-3246 (2005).

99. Karimi, K., Inman, M. D., Bienenstock, J. & Forsythe, P. Lactobacillus reuteri-
induced regulatory T cells protect against an allergic airway response in mice.
Am. J. Respir. Crit. Care Med. 179, 186-193 (2009).

100. Tang, C. et al. Inhibition of Dectin-1 signaling ameliorates colitis by inducing
Lactobacillus-mediated regulatory T cell expansion in the intestine. Cell Host
Microbe 18, 183-197 (2015).

101. Kullberg, M. C. et al. Bacteria-triggered CD4+ T regulatory cells suppress
Helicobacter hepaticus-induced colitis. J. Exp. Med. 196, 505-515 (2002).

102. Shen, Y. et al. Outer membrane vesicles of a human commensal mediate
immune regulation and disease protection. Cell Host Microbe 12, 509-520
(2012).

103. Faith, J. J., Ahern, P. P., Ridaura, V. K., Cheng, J. & Gordon, J. I. Identifying gut
microbe-host phenotype relationships using combinatorial communities in
gnotobiotic mice. Sci. Transl. Med. 6, 220ra11 (2014).

104. Coombes, J. L. et al. A functionally specialized population of mucosal CD103+
DCs induces Foxp3+ regulatory T cells via a TGF-β and retinoic acid-dependent
mechanism. J. Exp. Med. 204, 1757-1764 (2007).

105. Sun, C. M. et al. Small intestine lamina propria dendritic cells promote de novo
generation of Foxp3 T reg cells via retinoic acid. J. Exp. Med. 204, 1775-1785
(2007).

106. Mortha, A. et al. Microbiota-dependent crosstalk between macrophages and
ILC3 promotes intestinal homeostasis. Science 343, 1249288 (2014).

107. Loschko, J. et al. Absence of MHC class II on cDCs results in microbial-dependent
intestinal inflammation. J. Exp. Med. 213, 517-534 (2016).




108. Stary, G. et al. A mucosal vaccine against Chlamydia trachomatis generates two
waves of protective memory T cells. Science 348, aaa8205 (2015).

109. Olszak, T. et al. Microbial exposure during early life has persistent effects on
natural killer T cell function. Science 336, 489-493 (2012).

110. Scharschmidt, T. C. et al. A wave of regulatory T cells into neonatal skin mediates
tolerance to commensal microbes. Immunity 43, 1011-1021 (2015).

111. Russell, S. L. et al. Early life antibiotic-driven changes in microbiota enhance
susceptibility to allergic asthma. EMBO Rep. 13, 440-447 (2012).

112. Hill, D. A. et al. Commensal bacteria-derived signals regulate basophil
hematopoiesis and allergic inflammation. Nature Med. 18, 538-546 (2012).

113. Cahenzli, J., Koller, Y., Wyss, M., Geuking, M. B. & McCoy, K. D. Intestinal microbial
diversity during early-life colonization shapes long-term IgE levels. Cell Host
Microbe 14, 559-570 (2013).

114. Arrieta, M. C. et al. Early infancy microbial and metabolic alterations affect risk of
childhood asthma. Sci. Transl. Med. 7, 307ra152 (2015).

115. Devkota, S. et al. Dietary-fat-induced taurocholic acid promotes pathobiont
expansion and colitis in Il10−/− mice. Nature 487, 104-108 (2012).

116. Small, C. L., Reid-Yu, S. A., McPhee, J. B. & Coombes, B. K. Persistent infection
with Crohn’s disease-associated adherent-invasive Escherichia coli leads to
chronic inflammation and intestinal fibrosis. Nature Commun. 4, 1957 (2013).

117. Frank, D. N. et al. Disease phenotype and genotype are associated with shifts in
intestinal-associated microbiota in inflammatory bowel diseases. Inflamm. Bowel
Dis. 17, 179-184 (2011).

118. Ramanan, D., Tang, M. S., Bowcutt, R., Loke, P. & Cadwell, K. Bacterial sensor
Nod2 prevents inflammation of the small intestine by restricting the expansion
of the commensal Bacteroides vulgatus. Immunity 41, 311-324 (2014).

119. Gevers, D. et al. The treatment-naive microbiome in new-onset Crohn’s disease.
Cell Host Microbe 15, 382-392 (2014).

120. Scher, J. U. et al. Expansion of intestinal Prevotella copri correlates with enhanced
susceptibility to arthritis. eLife 2, e01202 (2013).

121. Vujkovic-Cvijin, I. et al. Dysbiosis of the gut microbiota is associated with HIV
disease progression and tryptophan catabolism. Sci. Transl. Med. 5, 193ra91
(2013).

122. Hand, T. W. et al. Acute gastrointestinal infection induces long-lived microbiota-
specific T cell responses. Science 337, 1553-1556 (2012).

123. Cong, Y., Feng, T., Fujihashi, K., Schoeb, T. R. & Elson, C. O. A dominant,
coordinated T regulatory cell-IgA response to the intestinal microbiota. Proc. Natl
Acad. Sci. USA 106, 19256-19261 (2009).

124. Lodes, M. J. et al. Bacterial flagellin is a dominant antigen in Crohn disease.
J. Clin. Invest. 113, 1296-1306 (2004).

125. Viaud, S. et al. The intestinal microbiota modulates the anticancer immune
effects of cyclophosphamide. Science 342, 971-976 (2013).

126. Vétizou, M. et al. Anticancer immunotherapy by CTLA-4 blockade relies on the
gut microbiota. Science 350, 1079-1084 (2015).

127. Charbonneau, M. R. et al. Sialylated milk oligosaccharides promote microbiota-
dependent growth in models of infant undernutrition. Cell 164, 859-871
(2016).

128. Cao, A. T. et al. Interleukin (IL)-21 promotes intestinal IgA response to microbiota.
Mucosal Immunol. 8, 1072-1082 (2015).

129. Kruglov, A. A. et al. Nonredundant function of soluble LTα3 produced by innate
lymphoid cells in intestinal homeostasis. Science 342, 1243-1246 (2013).

130. Sonnenberg, G. F., Monticelli, L. A., Elloso, M. M., Fouser, L. A. & Artis, D. CD4+
lymphoid tissue-inducer cells promote innate immunity in the gut. /mmunity 34, 
122-134 (2011). 

131.Longman, R. S. et al. CX3;CR1* mononuclear phagocytes support colitis- 
associated innate lymphoid cell production of IL-22. J. Exp. Med. 211, 
1571-1583 (2014). 

132.Cadwell, K. et al. Virus-plus-susceptibility gene interaction determines Crohn’s 
disease gene Atg16L1 phenotypes in intestine. Ce// 141, 1135-1145 (2010). 

133.Kernbauer, E., Ding, Y. & Cadwell, K. An enteric virus can replace the beneficial 
function of commensal bacteria. Nature 516, 94-98 (2014). 

134. Naik, S. et al. Commensal-dendritic-cell interaction specifies a unique protective 
skin immune signature. Nature 520, 104-108 (2015). 

135.Ichinohe, T. et al. Microbiota regulates immune defense against respiratory tract 
influenza A virus infection. Proc. Nat! Acad. Sci. USA 108, 5354-5359 (2011). 


Acknowledgements This work was supported by: grants from the Japan Agency for 
Medical Research and Development (AMED) and the Takeda Science Foundation 
(K.H.); US National Institutes of Health grant R01DK103358 and the Howard Hughes
Medical Institute (D.R.L.). 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare competing financial interests: see 
go.nature.com/28j4zjb. Readers are welcome to comment on the online version of 
this paper at go.nature.com/28j4zjb. Correspondence should be addressed to K.H. 
(kenya@keio.jp) and D.L. (dan.littman@med.nyu.edu). 




REVIEW 


doi:10.1038/nature18849 


Interactions between the microbiota 
and pathogenic bacteria in the gut 


Andreas J. Bäumler¹ & Vanessa Sperandio²,³


The microbiome has an important role in human health. Changes in the microbiota can confer resistance to or promote
infection by pathogenic bacteria. Antibiotics have a profound impact on the microbiota that alters the nutritional
landscape of the gut and can lead to the expansion of pathogenic populations. Pathogenic bacteria exploit
microbiota-derived sources of carbon and nitrogen as nutrients and regulatory signals to promote their own growth and
virulence. By eliciting inflammation, these bacteria alter the intestinal environment and use unique systems for
respiration and metal acquisition to drive their expansion. Unravelling the interactions between the microbiota, the
host and pathogenic bacteria will produce strategies for manipulating the microbiota against infectious diseases.


Appreciation of the important role of the microbiota in human health and nutrition has
grown steadily in the past decade. Initial studies focused on cataloguing the microbial
species that comprise the microbiota and correlating the composition of the microbiota
with the health or disease state of the host. The present period of renaissance has
resulted in technologies and interdisciplinary research that are conducive to mechanistic
studies and, in particular, those that focus on associations between the microbiota, the
host and pathogenic bacteria. Exciting research is now starting to unravel how the
composition of the microbiota can offer either resistance or assistance to invading
pathogenic species. The majority of these studies were conducted in the gastrointestinal
tract, in which associations between the host and microbes are of paramount importance.
The gut microbiota of each individual is unique at the genus and species levels; however,
it is more generally conserved at the phylum level, which is populated most prominently by
Bacteroidetes and Firmicutes, followed by Proteobacteria and Actinobacteria. Host genetics,
diet and environmental insults such as treatment with antibiotics alter the microbiota,
which can lead to varying susceptibility to infectious diseases between individuals.

The microbiota can promote resistance to colonization by pathogenic species. For instance,
mice that are treated with antibiotics or that are bred in sterile environments (known as
germ-free mice) are more susceptible to enteric pathogenic bacteria such as Shigella
flexneri, Citrobacter rodentium, Listeria monocytogenes and Salmonella enterica serovar
Typhimurium. Conversely, some microbiotas can lead to the expansion or enhanced virulence
of pathogenic populations. A notable example concerns how differences in the composition
of microbiotas determine the susceptibility of mice to infection with C. rodentium: the
transplantation of microbiotas from strains of mice that are susceptible to infection
induced similar susceptibility in animals that were previously resistant, and the
transplantation of microbiotas from resistant animals led to resistance to infection in
previously susceptible animals. Epidemiological surveys reinforce this idea. For example,
differential susceptibility to infection with Campylobacter jejuni was shown to depend on
the species composition of the microbiotas in a study of Swedish adults. Individuals with
a higher diversity within their microbiotas, and with an abundance of bacteria from the
genera Dorea and Coprococcus, were significantly more resistant to C. jejuni infection
than people who had low-diversity microbiotas in which Dorea and Coprococcus were not
abundant.
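
To make the notion of a "higher diversity" microbiota concrete, the following minimal sketch computes the Shannon index, one widely used measure of community diversity, from genus-level counts. The choice of index, the function name and the read counts are assumptions made for illustration; the survey described above may have used a different metric, and the numbers are not data from any study.

```python
import math


def shannon_diversity(counts):
    """Return the Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total          # relative abundance of this taxon
            h -= p * math.log(p)   # each taxon contributes -p * ln(p)
    return h


# Hypothetical genus-level read counts (illustrative only).
even_community = {"Dorea": 250, "Coprococcus": 250, "Bacteroides": 250, "Faecalibacterium": 250}
skewed_community = {"Dorea": 10, "Coprococcus": 10, "Bacteroides": 960, "Faecalibacterium": 20}

even_h = shannon_diversity(even_community)
skewed_h = shannon_diversity(skewed_community)
```

A community in which all four genera are equally abundant reaches the maximum value for four taxa, ln 4, whereas the community dominated by a single genus scores lower.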

The host's diet profoundly affects the composition of the microbiota, with repercussions
for the physiology, immunity and susceptibility to infectious diseases of the host. Dietary
choices have been shown to affect colonization by enterohaemorrhagic Escherichia coli
(EHEC) serotype O157:H7 and the severity and length of its resulting disease, and
supplementation of the diet with phytonutrients promotes the expansion of beneficial
Clostridia species that protect mice from colonization by C. rodentium.

The use of innovative technologies, in combination with more conventional approaches, is
driving our understanding of the interactions between the microbiota, the host and
pathogenic bacteria. The genetic tractability of several species of bacteria, as well as of
their mammalian hosts (such as mice), allows for the mechanistic investigation of these
relationships. The investigation of changes in the composition of microbiotas has been
driven by next-generation sequencing, which also facilitated the analysis of
transcriptomes. The growing power and finesse of metabolomics studies are quickly expanding
our knowledge of the impact of both the microbiota and of pathogenic bacteria on the
metabolic landscape of the gut. Here, we review advances in our understanding of the
complex relationships that determine the severity and outcome of gastrointestinal
infections. The majority of the mechanistic studies that investigate these interactions
have been conducted in S. Typhimurium, EHEC and Clostridium difficile; therefore, these
pathogenic organisms are covered more extensively than others in this Review.


Antibiotics 

Antibiotics revolutionized medicine and were justifiably dubbed ‘magic bullets’ against
bacterial infections. However, conventional antibiotics are generally bacteriostatic or
bactericidal, which means that they indiscriminately kill or prevent the growth of both
pathogenic and beneficial microbes. Antibiotics can alter the taxonomic, genomic and
functional features of the microbiota, and their effects can be rapid and sometimes
everlasting. They can decrease the diversity of the microbiota, which compromises
resistance to colonization by incoming pathogenic bacteria, most notably leading to an
expansion of C. difficile that can cause diarrhoea that leads to potentially fatal colitis.

C. difficile is a spore-forming bacterium that, on germination, colonizes 
the large intestine and causes colitis through the action of two toxins: 
TcdA and TcdB. The majority of C. difficile infections are nosocomial, 
but there has also been an increase in community-acquired infections, 
mainly due to the ubiquitous presence of C. difficile spores. C. difficile can colonize
the mammalian intestine without causing disease, but one of the most important risk
factors for colitis that is mediated by C. difficile is the use of antibiotics. The
antibiotics-mediated loss of resistance to colonization also allows colonization by
S. Typhimurium and the development of disease. Both C. difficile and S. Typhimurium
catabolize sialic acid as a source of carbon in the lumen to promote their expansion. They
rely on saccharolytic members of the microbiota, such as Bacteroides thetaiotaomicron, to
make this sugar freely available in the intestinal lumen. Treatment with antibiotics
increases the abundance of host-derived free sialic acid as well as enhancing its release
into the lumen by B. thetaiotaomicron, which promotes the expansion of the two pathogenic
bacteria. Antibiotic use also triggers production of the organic acid succinate, another
microbiota-derived nutrient that confers a growth advantage to C. difficile. It is often
present at a low concentration in the microbiotas of conventional mice, but its
concentration increases on treatment with antibiotics, which promotes a bloom of
C. difficile (Fig. 1).

Figure 1 | The impact of antibiotics on the microbiota and the expansion
of enteric pathogens. a, A diverse and non-disturbed microbiota confers
resistance to colonization by enteric pathogens in the intestinal epithelium.
b, Treatment with antibiotics decreases the diversity of the microbiota and
leads to expansion of the C. difficile population. Toxins that are released from
C. difficile (TcdA and TcdB) enter and damage the cells of the epithelium,
which leads to inflammation (colitis) and cell death. c, Treatment with
antibiotics also leads to an increase in the levels of free sialic acid (from the
host) and succinate (from the microbiota) in the lumen of the intestine.
Elevated sialic acid promotes the expansion of the S. Typhimurium
population, which can lead to inflammation (gastroenteritis) if the bacterium
invades the cells of the intestinal epithelium. Elevated levels of sialic acid and
succinate further promote the expansion of the C. difficile population and the
development of colitis and cell death.

¹Department of Medical Microbiology and Immunology, University of California, Davis, School of Medicine, Davis, California 95616, USA. ²Department of Microbiology, University of Texas
Southwestern Medical Center, Dallas, Texas 75390-9048, USA. ³Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9038, USA.

Knowledge of how microbiota disruption affects the ability of bona fide 
or opportunistic pathogenic organisms to infect hosts is still in its infancy. 
However, two underlying themes converge: microbiota-induced changes 
in the metabolite landscape of the gut and inflammation. 


Utilization of nutrients 

Simple dietary sugars are absorbed in the small intestine, which means that they are
unavailable as sources of carbon for the microbiota and pathogenic bacteria in the colon.
The most abundant members of the microbiota are those that are able to utilize the
undigested plant polysaccharides and host glycans that are present in the colon.

The gut epithelium is protected by a layer of mucus that is composed of proteins known as
mucins that are rich in fucose, galactose, sialic acid, N-acetylgalactosamine,
N-acetylglucosamine and mannose. These sugars are harvested by saccharolytic members of the
microbiota, such as Bacteroidales in the gut, which makes them available to species within
the microbiota that lack this capability. However, pathogenic bacteria in the gut can also
exploit the availability of these sugars to promote their own expansion. Several studies
have used B. thetaiotaomicron as a model Bacteroides in which to investigate these
syntrophic links. Sialic acid is a terminal sugar of some mucosal glycans, and
B. thetaiotaomicron has sialidase activity but lacks the catabolic pathway for sialic-acid
utilization. The bacterium therefore releases sialic acid to gain access to underlying
glycans that it can use as a source of carbon. The sialic acid that B. thetaiotaomicron
releases from the mucus can be catabolized by both C. difficile and S. Typhimurium, which
provides them with a growth advantage. The ability of the microbiota to use sialic acid
therefore depends on the action of B. thetaiotaomicron, and mutants that lack sialidase
fail to enhance the growth of these two pathogenic bacteria.

B. thetaiotaomicron also releases fucose from the mucus. It harbours multiple enzymes that
can cleave fucose from host glycans, so its presence results in the high availability of
fucose in the lumen of the gut. This free fucose can also be used as a source of carbon by
S. Typhimurium. Importantly, B. thetaiotaomicron can promote the fucosylation of mucosal
glycans when introduced into monoassociated germ-free mice.

The microbiota resides in the lumen and the outer mucus layer of the intestine. EHEC,
however, seeks to establish a unique niche by closely adhering to the enterocytes of the
intestinal epithelium. To achieve its goal, EHEC must successfully compete with the
microbiota for nutrients. B. thetaiotaomicron does not need to compete with EHEC, however,
because it can utilize polysaccharides; EHEC can only utilize monosaccharides and
disaccharides. EHEC’s main competitor is commensal E. coli, which preferentially utilizes
fucose as a source of carbon when growing in the mammalian intestine. To circumvent this
competition, EHEC utilizes other sources of sugar, such as galactose, the hexuronates,
mannose and ribose, which commensal E. coli cannot catabolize optimally (Fig. 2).

EHEC uses fucose as a signalling molecule with which to adjust its metabolism and to
regulate the expression of its virulence repertoire in the lumen and the outer mucus layer
of the colon. It horizontally acquired a pathogenicity island of genes that encode a
fucose-sensing signal-transduction system. This system is unique to EHEC and to
C. rodentium (which is used extensively in mouse models as a surrogate for the human
pathogen EHEC). It is composed of the membrane-bound histidine sensor kinase FusK, which
specifically autophosphorylates in response to fucose. FusK then transfers its phosphate
to a response regulator called FusR, which is a transcription factor. Phosphorylation
activates FusR, which represses the expression of the fucose utilization genes in EHEC and
so helps EHEC to avoid the need to compete for this nutrient with commensal E. coli. To
prevent the unnecessary expenditure of energy by EHEC, FusR also represses the genes that
encode the EHEC virulence machinery, a syringe-like apparatus known as a type III
secretion system (T3SS), which the bacterium uses to attach to enterocytes and hijack the
function of these host cells. EHEC therefore uses fucose, a host-derived signal that is
made available by the microbiota, to sense the environment of the intestinal lumen and to
modulate its own metabolism and virulence.
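
The FusK–FusR relay described above can be summarized as a small Boolean sketch. It is a deliberate simplification for illustration: the class and function names are hypothetical, signalling is treated as all-or-nothing, and an absence of FusR-mediated repression is not meant to imply that the T3SS genes are actively expressed, because other regulators also control them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusKRResponse:
    fusk_phosphorylated: bool
    fusr_active: bool
    fucose_utilization_repressed: bool
    t3ss_repressed: bool


def fuskr_response(fucose_sensed: bool) -> FusKRResponse:
    """Toy all-or-nothing model of the fucose-sensing two-component system."""
    # FusK autophosphorylates in response to fucose.
    fusk_phosphorylated = fucose_sensed
    # Phosphotransfer from FusK activates the response regulator FusR.
    fusr_active = fusk_phosphorylated
    # Active FusR represses both the fucose utilization genes and the T3SS genes.
    return FusKRResponse(
        fusk_phosphorylated=fusk_phosphorylated,
        fusr_active=fusr_active,
        fucose_utilization_repressed=fusr_active,
        t3ss_repressed=fusr_active,
    )


in_lumen = fuskr_response(fucose_sensed=True)
near_epithelium = fuskr_response(fucose_sensed=False)
```

In this sketch, sensing fucose (as in the lumen and outer mucus layer) switches on both forms of repression, and losing the fucose signal lifts them.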

To reach the lining of the epithelium, EHEC and C. rodentium produce mucinases, which
cleave the protein backbone of mucin-type glycoproteins. Expression of these enzymes is
increased by metabolites that are produced by B. thetaiotaomicron. Because mucus is one of
the main sources of sugar in the colon, where EHEC and C. rodentium colonize, obliteration
of the mucus layer creates a nutrient-poor environment near the epithelium that is referred
to as gluconeogenic. The colonization of mice by B. thetaiotaomicron therefore profoundly
changes the metabolic landscape of the mouse gut because it raises the levels of organic
acids such as succinate. Moreover, several metabolites that indicate a gluconeogenic
environment, such as lactate and glycerate, are also elevated. EHEC and C. rodentium sense
this gluconeogenic and succinate-rich environment through the transcriptional regulator
Cra. On receiving the cue that they have reached the lining of the gut epithelium, these
bacteria activate the expression of their T3SSs. EHEC therefore exploits metabolic cues
from B. thetaiotaomicron, and probably other members of the microbiota, to precisely
programme its metabolism and virulence (Fig. 2).

Other pathogenic bacteria can also adjust their gene expression in the presence of
microbiota-produced succinate. C. difficile induces a pathway that converts succinate to
butyrate, which confers a growth advantage in vivo. Populations of C. difficile mutants
that are unable to convert succinate fail to expand in the gut in the presence of
B. thetaiotaomicron.

Several short-chain fatty acids that are produced by the microbiota are important
determinants of interactions between the microbiota and pathogenic bacteria in the gut. The
abundance and composition of short-chain fatty acids is distinct in each compartment of the
intestine, and the ability to sense these differences might help pathogenic bacteria in
niche recognition. The most abundant short-chain fatty acids in the gut are acetate,
propionate and butyrate. S. Typhimurium preferentially colonizes the ileum, which generally
contains acetate at a concentration of 30 mM. This acetate concentration enhances the
expression of the S. Typhimurium Salmonella pathogenicity island 1 (SPI-1)-encoded T3SS
(T3SS-1), which is involved in the bacterium’s invasion of the host. Conversely, 70 mM
propionate and 20 mM butyrate, concentrations typical of the colon, suppress the expression
of the T3SS-1 (ref. 41). Propionate and butyrate seem to affect the T3SS-1 regulatory
cascade at various levels. However, the detailed mechanism of this regulation is yet to be
unravelled. In EHEC, exposure to the levels of butyrate found in the colon increases the
expression of the EHEC T3SS through post-transcriptional activation of the transcriptional
regulator Lrp. Exposure to the concentrations of acetate and propionate that are found in
the small intestine does not significantly affect the virulence of EHEC.

Diet has a profound effect on the composition of the microbiota and the concentration of
short-chain fatty acids in the gut. A diet that is high in fibre results in the enhanced
production of butyrate by the gut microbiota. This increases the host’s expression of
globotriaosylceramide, which is a receptor for the Shiga toxin that is produced by EHEC.
Shiga toxin can lead to the development of haemolytic uraemic syndrome (HUS) and is the
cause of the morbidity and mortality associated with outbreaks of EHEC. Consequently,
animals that are fed a high-fibre diet are more susceptible to Shiga toxin than are those
on a low-fibre diet and develop more severe disease. Conversely, increased levels of
microbiota-derived acetate protect animals from disease that is caused by the toxin.
Certain species of Bifidobacteria contribute to higher levels of acetate in the gut, which
helps to improve the barrier function of the intestinal epithelium and to prevent Shiga
toxin from reaching the bloodstream.

Enteric pathogenic bacteria also use other nutrients to successfully overcome the
microbiota’s resistance to their colonization. Ethanolamine is abundant in the mammalian
intestine. It can be used as a source of carbon and of nitrogen by a number of pathogenic
species, and food-borne bacteria are particularly adept at using it. However, it cannot be
metabolized by the majority of commensal species. S. Typhimurium, EHEC and
L. monocytogenes gain a growth advantage in the intestine through their ability to use this
compound. Ethanolamine is also used as a signal by EHEC and S. Typhimurium to activate the
expression of virulence genes. In addition, S. Typhimurium uses hydrogen produced by the
microbiota as an energy source to enhance its growth during the initial stage of infection.

Figure 2 | Modulation of enterohaemorrhagic E. coli virulence through
nutrients provided by the microbiota. a, The microbiota resides in the
lumen and outer mucus layer of the intestine. The saccharolytic bacterium
Bacteroides thetaiotaomicron is a prominent member of the microbiota. It can
release fucose from the mucus and makes the sugar available to other bacteria.
When EHEC senses fucose through the FusKR signalling system, it represses
both its use of the sugar and the expression of genes that encode the T3SS, a
protein-translocation apparatus that enables the bacterium to secrete effector
proteins into host cells. This repression prevents EHEC from competing for
fucose with commensal E. coli and from expending energy unnecessarily on
T3SS expression. b, Metabolites that are provided by B. thetaiotaomicron, such
as succinate, lead to an increase in the expression by EHEC of the enzyme
mucinase, which obliterates the mucus layers of the intestine. EHEC is then able
to reach the intestinal epithelium. B. thetaiotaomicron then begins to secrete
succinate and other metabolites that are required for gluconeogenesis into the
now nutrient-poor environment. The compounds are sensed by EHEC, which
upregulates its expression of the T3SS to enable the bacterium to attach to the
epithelial cells of the host intestine and form lesions that cause diarrhoea.

The exploitation of microbiota-derived molecules as both nutrients and signals is crucial
for the successful infection of the host by pathogenic bacteria. Although such organisms
have clearly developed many strategies through which to circumvent the microbiota’s
resistance to colonization, and in many cases even employ its help, the microbiota pushes
back, which creates an intense competition for resources. The ability of EHEC to colonize
the intestine stems from differences in the sources of sugar that are used by EHEC and by
commensal E. coli. For example, the presence of multiple strains of commensal E. coli with
overlapping nutritional requirements interferes with the colonization of the mouse
intestine by EHEC. The study in question used a streptomycin-treated mouse model of EHEC
and three distinct commensal strains of E. coli to assess the differential sugar
requirements for the successful colonization of the intestine. EHEC could colonize mice
that were pre-colonized with any one of the commensal strains, but it could not colonize
mice that were pre-colonized with all three strains. EHEC has evolved to exploit distinct
sources of sugar during colonization of the gut. It utilizes catabolic pathways for the
hexuronates glucuronate and galacturonate and for sucrose that are not employed by
commensal E. coli within the gut. It can also metabolize several sugars simultaneously:
the loss of multiple catabolic pathways has an additive effect on colonization. This
phenomenon is not observed in commensal E. coli, however, which suggests that commensal
E. coli uses available sugars in a stepwise fashion. EHEC therefore differs from commensal
E. coli in metabolic strategy and the use of nutrients for the colonization of the
mammalian intestine.
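
The pre-colonization result above (EHEC establishes alongside any single commensal strain but not alongside all three) follows from a simple niche-occupancy argument, sketched below. The function name and the sugar assignments are invented so as to reproduce the qualitative pattern; they are not measured nutrient preferences of any strain, and real competition is quantitative rather than all-or-nothing.

```python
def can_colonize(invader_sugars, resident_strains):
    """Toy rule: the invader establishes only if at least one of its
    usable sugars is not consumed by any resident strain."""
    occupied = set().union(*resident_strains)  # sugars used by any resident
    return bool(set(invader_sugars) - occupied)


# Hypothetical sugar repertoires (illustrative only).
ehec = {"galactose", "mannose", "ribose"}
commensal_a = {"fucose", "galactose"}
commensal_b = {"fucose", "mannose"}
commensal_c = {"fucose", "ribose"}

single_strain_outcomes = [can_colonize(ehec, [s]) for s in (commensal_a, commensal_b, commensal_c)]
three_strain_outcome = can_colonize(ehec, [commensal_a, commensal_b, commensal_c])
```

With these assumed repertoires, each commensal strain alone leaves EHEC at least one unused sugar, whereas the three strains together occupy every sugar that EHEC can use.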

C. rodentium is outcompeted and then cleared from the mouse gut through a bloom in the
population of commensal E. coli, which competes with C. rodentium for monosaccharides for
nutrition. By contrast, C. rodentium is not cleared by B. thetaiotaomicron in germ-free
mice that are fed a diet that contains both monosaccharides, which can be used by
Enterobacteriaceae such as C. rodentium, and polysaccharides, which can be used by
Bacteroides. However, when the mice are switched to a diet that consists only of
monosaccharides, B. thetaiotaomicron and C. rodentium are forced to compete for sugars, and
B. thetaiotaomicron outcompetes C. rodentium. The ability of pathogenic bacteria to
successfully compete with commensal species for nutrients is therefore important for their
establishment in the gut.


Interception of signals from the microbiota and the host

The microbiota affects the risks and courses of enteric diseases. Vibrio cholerae is a
major cause of explosive diarrhoea in which there is extensive disruption of the
intestinal population of microbes. Metagenomic studies of the faecal microbiota of people
with cholera in Bangladesh show that recovery is characterized by a certain microbiota
signature. Reconstitution of this microbiota in germ-free mice restricts the infectivity
of V. cholerae. Specifically, the presence of Ruminococcus obeum can hamper the
colonization of the intestines by V. cholerae through the production of the furanone signal
autoinducer-2, which causes the repression of several V. cholerae colonization factors.

Another example of the effect of microbiota-derived signals on host colonization is their
use by EHEC in the colonization of its ruminant reservoir. EHEC exclusively colonizes the
recto-anal junction of adult cattle. Through the sensor protein SdiA, EHEC detects
acyl-homoserine lactone signals from the rumen microbiota, which it uses to reprogram
itself to survive the acidic pH of the animal’s stomachs and to successfully colonize the
recto-anal junction.

As well as being able to directly detect signals that are derived from the microbiota,
pathogenic bacteria can detect host-derived signals that have been modified by the
microbiota to modulate their virulence. V. cholerae has a type VI secretion system (T6SS),
which it uses to kill other bacteria. During its colonization of the intestine, V. cholerae
comes in contact with the mucosal microbiota, which can affect the composition of bile
acids in the intestine. For example, Bifidobacterium bifidum negatively regulates the T6SS
activity of V. cholerae through the metabolic conversion of three bile acids
(glycodeoxycholic acid, taurodeoxycholic acid and cholic acid) into the bile acid
deoxycholic acid. Deoxycholic acid, but not its unmodified salts, decreases the expression
of T6SS genes. Bile-acid conversion by commensals therefore lowers the activity of the
T6SS and reduces the killing of E. coli by V. cholerae.

Another microbiota-modified host signal that is detected by pathogenic bacteria is the
neurotransmitter noradrenaline. The gut is highly innervated, and neurotransmitters are
important signals in the gastrointestinal tract, where they modulate peristalsis, the flow
of blood and the secretion of ions. The microbiota affects the availability of
neurotransmitters in the intestinal lumen, as well as their biosynthesis. For example, the
microbiota induces biosynthesis of serotonin, and microbiota-derived enzymatic activities
increase the levels of active noradrenaline in the gut lumen. Noradrenaline is synthesized
by the adrenergic neurons of the enteric nervous system and it is inactivated by the host
through conjugation with glucuronic acid (to produce a glucuronide). Microbiota-produced
enzymes known as glucuronidases then deconjugate glucuronic acid from noradrenaline, which
increases the amount of active noradrenaline in the lumen of the intestine. Several
pathogenic bacteria of the gut, including EHEC, S. Typhimurium and V. parahaemolyticus,
sense noradrenaline to activate the expression of virulence genes. Two adrenergic sensors
have been identified in bacteria: the membrane-bound histidine kinases QseC and QseE. QseC
also detects the microbiota-produced signal autoinducer-3 (refs 64 and 66), so the sensing
of signals from both the host and the microbiota converge at the level of a single
receptor, a process known as inter-kingdom signalling.


Inflammation 

Although diet and the composition of the microbiota heavily influence the availability of
nutrients in the gut, the host also has an important part to play. A crucial driver of
changes in the gut environment is the inflammatory response of the host. Intestinal
inflammation in people is associated with an imbalance in the microbiota, known as
dysbiosis, and is characterized by a reduced diversity of microbes, a reduced abundance of
obligate anaerobic bacteria and an expansion of facultative anaerobic bacteria in the
phylum Proteobacteria, mostly members of the family Enterobacteriaceae. Similar changes in
the composition of the gut microbiota are observed in mice with chemically induced colitis
and genetically induced colitis. These changes in the structure of the microbiota probably
reflect an altered nutritional environment that is created by the inflammatory response of
the host.

The availability of nutrients in the large intestine is altered during inflammation
through changes in the composition of mucus carbohydrates. Interleukin (IL)-22, a cytokine
that is prominently induced in the intestinal mucosa when mice and rhesus macaques are
infected with S. Typhimurium, stimulates the epithelial expression of galactoside
2-α-L-fucosyltransferase 2 and enhances the α(1,2)-fucosylation of mucus carbohydrates.
The gut microbiota can liberate fucose from mucus carbohydrates, which leads to the
induction of genes for fucose utilization in E. coli. Similarly, increased fucosylation of
glycans is observed during S. Typhimurium-induced colitis in mice, which correlates with
elevated synthesis of the proteins involved in fucose utilization. Mucus fucosylation that
is induced during infection with C. rodentium causes changes in the composition of the gut
microbiota that help to protect the host from the expansion and epithelial translocation
of the pathobiont Enterococcus faecalis.

Another driver of changes in the nutritional environment of the gut is 
the generation of reactive oxygen species and reactive nitrogen species 
during inflammation. Pro-inflammatory cytokines such as interferon-γ
(IFN-γ) activate dual oxidase 2 in the intestinal epithelium, which
produces hydrogen peroxide. Increased expression of DUOX2, the gene that
encodes dual oxidase 2, in the intestinal mucosa of patients with Crohn's
disease and ulcerative colitis correlates with an expansion of
Proteobacteria in the gut microbiota. IFN-γ also induces epithelial expression of
the gene Nos2 (ref. 84), which encodes inducible nitric oxide synthase, the
enzyme that catalyses the production of nitric oxide from L-arginine.
As a result, the concentration of nitric oxide is elevated in gases from the
colons of people with inflammatory bowel disease. Although reactive
oxygen and nitrogen species have antimicrobial activity, these radicals
quickly form non-toxic compounds in the lumen of the gut as they diffuse
away from the epithelium. For example, when they are generated during
inflammation by host enzymes in the intestinal epithelium, these
species react to form nitrate. This by-product of inflammation is present at
elevated concentrations in the intestines of mice with chemically induced
colitis (Fig. 3). Nitrate reductases, enzymes that are broadly conserved
among the Enterobacteriaceae, couple the reduction of nitrate to
energy-conserving electron transport systems for respiration, a process termed
nitrate respiration. However, the genes that encode them are absent from
the genomes of obligate anaerobic Clostridia or Bacteroidia. Nitrate
respiration drives the Nos2-dependent expansion of commensal E. coli in
mice with chemically or genetically induced colitis, but not in animals
without signs of intestinal inflammation. Respiratory electron
acceptors that are generated as a by-product of the host inflammatory response
therefore create a niche in the lumen of the intestines that supports the
uncontrolled expansion of commensal Enterobacteriaceae rather than
of obligate anaerobic bacteria. The resulting bloom in the inflamed
intestine is one of the most consistent and robust ecological patterns that
has been observed in the gut microbiota.

Figure 3 | The effect of intestinal inflammation on nutrient availability.
S. Typhimurium uses its virulence factors (T3SS-1 and T3SS-2) to trigger
intestinal inflammation. Cytokines that are released during inflammation,
such as IL-22 and IFN-γ, trigger the release of the antimicrobial molecules
lipocalin-2, reactive oxygen species (ROS) and reactive nitrogen species
(RNS) from the intestinal epithelium. Lipocalin-2 can block the growth of
commensal Enterobacteriaceae that rely on the siderophore enterobactin
for the acquisition of iron (Fe³⁺). It does not bind to the S. Typhimurium
siderophore salmochelin, however, which confers the bacterium with
resistance to its effects on growth. RNS and ROS react to form nitrate,
which drives the growth of Enterobacteriaceae through nitrate respiration.
Microbiota-derived hydrogen sulfide is converted to thiosulfate by colonic
epithelial cells. Neutrophils that migrate into the lumen of the intestine during
inflammation generate ROS that convert endogenous sulfur compounds
(thiosulfate) into an electron acceptor (tetrathionate) that further boosts the
growth of S. Typhimurium through tetrathionate respiration.

The creation of a niche for respiratory nutrients during inflammation
is also an important driver of the strategies that pathogenic bacteria from
the family Enterobacteriaceae use to invade the gut ecosystem. In the
absence of inflammation or treatment with antibiotics, members of the
gut microbiota occupy all available nutrient niches, which makes it very
challenging for pathogenic Enterobacteriaceae to enter the community.
One solution is for these bacteria to trigger intestinal inflammation, which
would coerce the host into creating a fresh niche of respiratory nutrients
that is suitable for their expansion, an approach that is used by
S. Typhimurium. On ingestion, S. Typhimurium uses T3SS-1 to invade the
intestinal epithelium and T3SS-2 to survive in the tissue of the host.
Both of these processes trigger acute intestinal inflammation in cattle
and in mouse models of gastroenteritis (Fig. 3). The inflammatory
response of the host drives the expansion of S. Typhimurium in the lumen
of the gut, which is required for the transmission of this pathogenic
species to a new host through the faecal-oral route.

Although such expansion allows S. Typhimurium to side-step
competition with obligate anaerobic Clostridia and Bacteroidia, this strategy
forces the bacterium into battle with commensal Enterobacteriaceae
over limited resources. For example, S. Typhimurium expands in the
inflamed gut through nitrate respiration, which results in rivalry
with commensal Enterobacteriaceae that pursue a similar strategy.
S. Typhimurium can gain an edge in this competition through its ability
to utilize a broader range of inflammation-derived electron acceptors
than its rivals. A source of one such electron acceptor is sulfate-reducing
species of Desulfovibrio from the microbiota, which release hydrogen
sulfide, a compound that is converted to thiosulfate by the epithelium
of the colon to avoid toxicity. Deployment of the virulence factors of
pathogenic bacteria leads to the recruitment of neutrophils to the
intestinal mucosa, which is the histopathological hallmark of
S. Typhimurium-induced gastroenteritis. A fraction of these recruited neutrophils migrate
into the lumen of the intestine, a diagnostic marker of inflammatory
diarrhoea. In the lumen, neutrophils help to protect the mucosa by
engulfing bacteria in the vicinity of the epithelium, but reactive oxygen
species that are generated by the phagocyte-produced NADPH oxidase 2
(also known as cytochrome b-245 heavy chain) convert thiosulfate into
tetrathionate, a respiratory electron acceptor that supports the expansion
of S. Typhimurium in the lumen of the inflamed gut (Fig. 3). Although
tetrathionate respiration is a characteristic of Salmonella serovars and has
been used empirically in their isolation in clinical microbiology
laboratories since 1923 (ref. 107), insights into the respiratory nutrient niche
that Salmonella occupies suggest that this property is part of a strategy to
edge out competing commensal Enterobacteriaceae in the inflamed gut.


BOX 1

Microbiota interventions as therapeutic
strategies to limit pathogen expansion

An imbalance in the gut microbiota might underlie many human
diseases but, in most cases, the development of treatment options
is still in its infancy. This could be in part because the mechanisms
that lead to adverse effects in the host differ for each disease, which
means that intervention strategies must be developed for each. The
treatment options for antibiotic-induced dysbiosis are perhaps the
most advanced, mainly because faecal microbiota transplantation
can reverse this imbalance in the gut microbiota. Nonetheless, the
mechanisms through which treatment with antibiotics encourages
an uncontrolled expansion of the obligate anaerobe C. difficile differ
markedly from those that stimulate the growth of the facultative
anaerobic Enterobacteriaceae, which has implications for the
development of precision microbiome interventions.

Mice that are treated with streptomycin have a reduced abundance
of members of the class Clostridia, which are credited with
producing the lion’s share of the short-chain fatty acid butyrate in the
large intestine. The resulting depletion of short-chain fatty acids
drives an expansion of Enterobacteriaceae through mechanisms that
are not fully resolved. Depletion of Clostridia-derived butyrate
affects the metabolism of enterocytes in the colon, which derive most
of their energy by butyrate respiration. The depletion of short-chain
fatty acids also leads to a contraction in the pool of regulatory T cells
in the colonic mucosa. These changes in the host physiology
increase the inflammatory tone of the mucosa, as indicated by the
elevated expression of Nos2, the gene that encodes inducible nitric
oxide synthase, and contribute to the expansion of commensal E. coli
through nitrate generation. Although other mechanisms probably
contribute to the post-antibiotic expansion of certain populations of
bacteria in the gut, the transfer of Clostridia, with their capacity
for producing short-chain fatty acids, represents the most effective
treatment for limiting the growth of E. coli in streptomycin-treated
mice.

By contrast, the post-antibiotic expansion of the C. difficile
population is driven by a depletion of secondary bile salts. The liver
produces the primary bile salts cholate and chenodeoxycholate, which
are conjugated to the amino acids taurine (to produce taurocholate
and taurochenodeoxycholate) or glycine (to produce glycocholate
and glycochenodeoxycholate) and then secreted into the gut. Bile
salt hydrolases, enzymes that are produced by many members of the
gut microbiota, remove the conjugated amino acid from the primary
bile salt. C. scindens is one of a limited number of species of bacteria
that can actively transport cholate and chenodeoxycholate into its
cytosol, where these unconjugated primary bile salts are converted
into the secondary bile salts deoxycholate and lithocholate, which
are subsequently secreted into the extracellular environment
(Box Fig.). Although both primary and secondary bile salts induce the
germination of C. difficile spores, only secondary bile salts efficiently
prevent the growth of vegetative C. difficile cells. By significantly
reducing the abundance of species that are capable of producing
deoxycholate and lithocholate, treatment with antibiotics causes a
depletion of these secondary bile salts and promotes the expansion
of vegetative C. difficile cells in the large intestine. Faecal
microbiota transplantation restores the production of secondary bile
salts and therefore prevents the expansion of C. difficile. Direct
supplementation of the diet with secondary bile salts warrants caution
because increased concentrations of bile salts have been linked
to gastrointestinal cancers. However, inoculation with only the
secondary-bile-salt-producing C. scindens confers mice with resistance
to C. difficile expansion following treatment with antibiotics. This
remarkable observation opens the door to novel precision microbiome
interventions that aim to prevent or treat the colitis that is associated
with C. difficile infection after antibiotic therapy.

Box Figure | Labels: intestinal wall; conjugated primary bile salts;
unconjugated primary bile salts (cholate, chenodeoxycholate);
secondary bile salts (deoxycholate, lithocholate).

The inflammatory response of the host also ignites competition
between commensal and pathogenic Enterobacteriaceae over trace
elements such as iron, which is less available during inflammation. IL-22
induces the release of the antimicrobial protein lipocalin-2 (also known
as neutrophil gelatinase-associated lipocalin) from the epithelium in
mice and rhesus macaques. Lipocalin-2 reduces iron availability
by binding to enterobactin, a low-molecular-weight iron chelator (or
siderophore) that is produced by Enterobacteriaceae. To overcome
this, S. Typhimurium and some commensal E. coli secrete a glycosylated
derivative of enterobactin, termed salmochelin, which is not bound by
lipocalin-2 (ref. 108). By producing salmochelin as well as two further
siderophores that are not bound by lipocalin-2, yersiniobactin and
aerobactin, the probiotic E. coli strain Nissle 1917 can limit the expansion of
S. Typhimurium in the lumen of the inflamed gut. Conversely,
lipocalin-2 secretion by the epithelium generates an environment that enables
S. Typhimurium to edge out commensal Enterobacteriaceae that depend
solely on enterobactin for the acquisition of iron (Fig. 3).

Through its limitation of iron availability, intestinal inflammation also
sets the stage for battles between Enterobacteriaceae that use
protein-based toxins known as colicins that affect a narrow range of hosts. Iron
limitation induces the synthesis of siderophore receptor proteins for the
bacterial outer membrane, which also commonly serve as receptors
for colicins. Expression of a siderophore receptor protein termed
the colicin I receptor (CirA) renders commensal E. coli sensitive
to colicin Ib produced by S. Typhimurium. The respiratory nutrient
niche that is generated by the inflammatory response of the host is
therefore a battleground on which commensal and pathogenic
Enterobacteriaceae struggle for dominance using a diverse arsenal of nutritional and
antimicrobial strategies.


Perspective and the future 

The study of the microbiome began more than a century ago.
Sequencing of 16S rRNA genes provided the first insights into the taxonomic
composition of microbial communities. Later, sequencing of the
complete metagenome of microbial communities provided a more detailed
insight into the full genetic capacity of such a community. The use of
germ-free animals, either alone or in combination with emerging
technologies such as laser-capture microdissection and transcriptomics,
enabled mechanistic studies of the associations between the microbiota, the
host and pathogenic bacteria. Multi-taxon insertion sequencing now
allows researchers to investigate both the assembly and the shared and
strain-specific dietary requirements of communities of microbes, and
it has also facilitated the informed manipulation of such communities
through diet. The development of quantitative imaging technologies
has provided insight into the localization of microbes within the
gastrointestinal tract, and it has also enabled studies on the proximity of and
interactions between microbes. The increasing refinement and power
of metabolomics, imaging mass spectrometry and three-dimensional
mapping of mass-spectrometry data provide a high-resolution image of
the complex chemistry landscape of the interactions between microbes
and the host, which sets the stage for manipulating this chemistry to
prevent or treat infectious diseases. A marriage of metagenomics
and mathematical modelling promises to enhance the precision of
microbiome reconstitution, which has proven successful for tackling
C. difficile infections in mice. In these exciting times, the expansion of
multidisciplinary research is rapidly generating new technologies and
mechanistic insights into interactions between the microbiota, the host
and pathogenic bacteria (Box 1).


Received 4 September 2015; accepted 22 April 2016. 


1. Eckburg, P. B. et al. Diversity of the human intestinal microbial flora. Science 
308, 1635-1638 (2005). 

2. Frank, D. N. et al. Molecular-phylogenetic characterization of microbial
community imbalances in human inflammatory bowel diseases. Proc. Natl Acad. 
Sci. USA 104, 13780-13785 (2007). 

3. The Human Microbiome Project Consortium. Structure, function and diversity of 
the healthy human microbiome. Nature 486, 207-214 (2012). 

4. Ley, R. E., Peterson, D. A. & Gordon, J. I. Ecological and evolutionary forces
shaping microbial diversity in the human intestine. Cell 124, 837-848 (2006).

5. Yurist-Doutsch, S., Arrieta, M. C., Vogt, S. L. & Finlay, B. B. Gastrointestinal 
microbiota-mediated control of enteric pathogens. Annu. Rev. Genet. 48, 
361-382 (2014). 

6. Sassone-Corsi, M. & Raffatellu, M. No vacancy: how beneficial microbes 
cooperate with immunity to provide colonization resistance to pathogens. 
J. Immunol. 194, 4081-4087 (2015). 


7. Cameron, E. A. & Sperandio, V. Frenemies: signaling and nutritional integration
in pathogen-microbiota-host interactions. Cell Host Microbe 18, 275-284
(2015).

8. Pacheco, A. R. & Sperandio, V. Enteric pathogens exploit the
microbiota-generated nutritional environment of the gut. Microbiol. Spectr. 3,
MBP-0001-2014 (2015).

9. Bohnhoff, M., Drake, B. L. & Miller, C. P. Effect of streptomycin on susceptibility of
intestinal tract to experimental Salmonella infection. Proc. Soc. Exp. Biol. Med. 86,
132-137 (1954).

10. Ferreira, R. B. et al. The intestinal microbiota plays a role in Salmonella-induced
colitis independent of pathogen colonization. PLoS ONE 6, e20338 (2011).

11. Sprinz, H. et al. The response of the germfree guinea pig to oral bacterial
challenge with Escherichia coli and Shigella flexneri. Am. J. Pathol. 39, 681-695
(1961).

12. Zachar, Z. & Savage, D. C. Microbial interference and colonization of the murine
gastrointestinal tract by Listeria monocytogenes. Infect. Immun. 23, 168-174
(1979).

13. Kamada, N. et al. Regulated virulence controls the ability of a pathogen to
compete with the gut microbiota. Science 336, 1325-1329 (2012).
This study showed that enteric pathogens use virulence genes to compete
with the microbiota for nutrients in the gut.

14. Ghosh, S. et al. Colonic microbiota alters host susceptibility to infectious colitis
by modulating inflammation, redox status, and ion transporter gene expression.
Am. J. Physiol. Gastrointest. Liver Physiol. 301, G39-G49 (2011).

15. Willing, B. P., Vacharaksa, A., Croxen, M., Thanachayanont, T. & Finlay, B. B.
Altering host resistance to infections through microbial transplantation. PLoS
ONE 6, e26988 (2011).

16. Kampmann, C., Dicksved, J., Engstrand, L. & Rautelin, H. Composition of human
faecal microbiota in resistance to Campylobacter infection. Clin. Microbiol. Infect.
22, 61.e1-61.e8 (2016).

17. Kau, A. L., Ahern, P. P., Griffin, N. W., Goodman, A. L. & Gordon, J. I. Human
nutrition, the gut microbiome and the immune system. Nature 474, 327-336
(2011).

18. Zumbrun, S. D. et al. Dietary choice affects Shiga toxin-producing Escherichia
coli (STEC) O157:H7 colonization and disease. Proc. Natl Acad. Sci. USA 110,
E2126-E2133 (2013).

19. Wlodarska, M., Willing, B. P., Bravo, D. M. & Finlay, B. B. Phytonutrient diet
supplementation promotes beneficial Clostridia species and intestinal mucus
secretion resulting in protection against enteric infection. Sci. Rep. 5, 9253
(2015).

20. Modi, S. R., Collins, J. J. & Relman, D. A. Antibiotics and the gut microbiota. J. Clin.
Invest. 124, 4212-4218 (2014).

21. Leffler, D. A. & Lamont, J. T. Clostridium difficile infection. N. Engl. J. Med. 373,
287-288 (2015).

22. Pavia, A. T. et al. Epidemiologic evidence that prior antimicrobial exposure
decreases resistance to infection by antimicrobial-sensitive Salmonella. J. Infect.
Dis. 161, 255-260 (1990).

23. Ng, K. M. et al. Microbiota-liberated host sugars facilitate post-antibiotic
expansion of enteric pathogens. Nature 502, 96-99 (2013).
This study showed that treatment with antibiotics enhances the abundance of
host sialic acid that can be harvested by the microbiota, which promotes the
expansion of enteric pathogens.

24. Ferreyra, J. A. et al. Gut microbiota-produced succinate promotes C. difficile
infection after antibiotic treatment or motility disturbance. Cell Host Microbe 16,
770-777 (2014).

25. Ferreyra, J. A., Ng, K. M. & Sonnenburg, J. L. The enteric two-step: nutritional
strategies of bacterial pathogens within the gut. Cell. Microbiol. 16, 993-1003
(2014).

26. Rakoff-Nahoum, S., Coyne, M. J. & Comstock, L. E. An ecological network of
polysaccharide utilization among human intestinal symbionts. Curr. Biol. 24,
40-49 (2014).

27. Alverdy, J., Chi, H. S. & Sheldon, G. F. The effect of parenteral nutrition on
gastrointestinal immunity. The importance of enteral stimulation. Ann. Surg.
202, 681-684 (1985).

28. Fischbach, M. A. & Sonnenburg, J. L. Eating for two: how metabolism establishes
interspecies interactions in the gut. Cell Host Microbe 10, 336-347 (2011).

29. Chow, W. L. & Lee, Y. K. Free fucose is a danger signal to human intestinal
epithelial cells. Br. J. Nutr. 99, 449-454 (2008).

30. Bourlioux, P., Koletzko, B., Guarner, F. & Braesco, V. The intestine and its
microflora are partners for the protection of the host: report on the Danone
Symposium “The Intelligent Intestine,” held in Paris, June 14, 2002. Am. J. Clin.
Nutr. 78, 675-683 (2003).

31. Bry, L., Falk, P. G., Midtvedt, T. & Gordon, J. I. A model of host-microbial
interactions in an open mammalian ecosystem. Science 273, 1380-1383
(1996).

32. Hooper, L. V., Xu, J., Falk, P. G., Midtvedt, T. & Gordon, J. I. A molecular sensor
that allows a gut commensal to control its nutrient foundation in a competitive
ecosystem. Proc. Natl Acad. Sci. USA 96, 9833-9838 (1999).

33. Fabich, A. J. et al. Comparison of carbon nutrition for pathogenic and
commensal Escherichia coli strains in the mouse intestine. Infect. Immun. 76,
1143-1152 (2008).

34. Autieri, S. M. et al. L-fucose stimulates utilization of D-ribose by Escherichia coli
MG1655 ΔfucAO and E. coli Nissle 1917 ΔfucAO mutants in the mouse intestine
and in M9 minimal medium. Infect. Immun. 75, 5465-5475 (2007).

35. Pacheco, A. R. et al. Fucose sensing regulates bacterial intestinal colonization.
Nature 492, 113-117 (2012).
This paper showed that sugars from the mucus that are released by the
microbiota can be perceived as signals by enteric pathogens to regulate the
expression of virulence genes.

36. Schauer, D. B. & Falkow, S. Attaching and effacing locus of a Citrobacter freundii
biotype that causes transmissible murine colonic hyperplasia. Infect. Immun. 61,
2486-2492 (1993).

37. Szabady, R. L., Lokuta, M. A., Walters, K. B., Huttenlocher, A. & Welch, R. A.
Modulation of neutrophil function by a secreted mucinase of Escherichia coli
O157:H7. PLoS Pathog. 5, e1000320 (2009).

38. Curtis, M. M. et al. The gut commensal Bacteroides thetaiotaomicron exacerbates
enteric infection through modification of the metabolic landscape. Cell Host
Microbe 16, 759-769 (2014).
This study revealed that metabolites that are produced by the microbiota can
be exploited as cues to increase the virulence of enteric pathogens.

39. Macy, J. M., Ljungdahl, L. G. & Gottschalk, G. Pathway of succinate and
propionate formation in Bacteroides fragilis. J. Bacteriol. 134, 84-91 (1978).

40. Carter, P. B. & Collins, F. M. The route of enteric infection in normal mice. J. Exp.
Med. 139, 1189-1203 (1974).

41. Lawhon, S. D., Maurer, R., Suyemoto, M. & Altier, C. Intestinal short-chain fatty
acids alter Salmonella typhimurium invasion gene expression and virulence
through BarA/SirA. Mol. Microbiol. 46, 1451-1464 (2002).

42. Takao, M., Yen, H. & Tobe, T. LeuO enhances butyrate-induced virulence
expression through a positive regulatory loop in enterohaemorrhagic Escherichia
coli. Mol. Microbiol. 93, 1302-1313 (2014).

43. Karmali, M. A., Petric, M., Lim, C., Fleming, P. C. & Steele, B. T. Escherichia coli
cytotoxin, haemolytic-uraemic syndrome, and haemorrhagic colitis. Lancet 2,
1299-1300 (1983).

44. Fukuda, S. et al. Bifidobacteria can protect from enteropathogenic infection
through production of acetate. Nature 469, 543-547 (2011).
This study showed that the acetate that is produced by certain probiotic
bacteria can enhance the barrier function of the gut epithelium to protect the
host from enteric infections.

45. Bertin, Y. et al. Enterohaemorrhagic Escherichia coli gains a competitive
advantage by using ethanolamine as a nitrogen source in the bovine intestinal
content. Environ. Microbiol. 13, 365-377 (2011).

46. Garsin, D. A. Ethanolamine utilization in bacterial pathogens: roles and
regulation. Nature Rev. Microbiol. 8, 290-295 (2010).

47. Korbel, J. O. et al. Systematic association of genes to phenotypes by genome and
literature mining. PLoS Biol. 3, e134 (2005).

48. Thiennimitr, P. et al. Intestinal inflammation allows Salmonella to use
ethanolamine to compete with the microbiota. Proc. Natl Acad. Sci. USA 108,
17480-17485 (2011).

49. Joseph, B. et al. Identification of Listeria monocytogenes genes contributing
to intracellular replication by expression profiling and mutant screening.
J. Bacteriol. 188, 556-568 (2006).

50. Kendall, M. M., Gruber, C. C., Parker, C. T. & Sperandio, V. Ethanolamine controls
expression of genes encoding components involved in interkingdom signaling
and virulence in enterohemorrhagic Escherichia coli O157:H7. mBio 3,
00050-12 (2012).

51. Anderson, C. J., Clark, D. E., Adli, M. & Kendall, M. M. Ethanolamine signaling
promotes Salmonella niche recognition and adaptation during infection. PLoS
Pathog. 11, e1005278 (2015).

52. Maier, L. et al. Microbiota-derived hydrogen fuels Salmonella Typhimurium
invasion of the gut ecosystem. Cell Host Microbe 14, 641-651 (2013).

53. Maltby, R., Leatham-Jensen, M. P., Gibson, T., Cohen, P. S. & Conway, T. Nutritional
basis for colonization resistance by human commensal Escherichia coli strains
HS and Nissle 1917 against E. coli O157:H7 in the mouse intestine. PLoS ONE 8,
e53957 (2013).

54. Fabich, A. J. et al. Comparison of carbon nutrition for pathogenic and
commensal Escherichia coli strains in the mouse intestine. Infect. Immun. 76,
1143-1152 (2008).

55. Hsiao, A. et al. Members of the human gut microbiota involved in recovery from
Vibrio cholerae infection. Nature 515, 423-426 (2014).

56. Hughes, D. T. et al. Chemical sensing in mammalian host-bacterial commensal
associations. Proc. Natl Acad. Sci. USA 107, 9831-9836 (2010).

57. Bachmann, V. et al. Bile salts modulate the mucin-activated type VI secretion
system of pandemic Vibrio cholerae. PLoS Negl. Trop. Dis. 9, e0004031 (2015).

58. Horger, S., Schultheiss, G. & Diener, M. Segment-specific effects of epinephrine
on ion transport in the colon of the rat. Am. J. Physiol. 275, G1367-G1376
(1998).

59. Yano, J. M. et al. Indigenous bacteria from the gut microbiota regulate host
serotonin biosynthesis. Cell 161, 264-276 (2015).

60. Asano, Y. et al. Critical role of gut microbiota in the production of biologically
active, free catecholamines in the gut lumen of mice. Am. J. Physiol. Gastrointest.
Liver Physiol. 303, G1288-G1295 (2012).

61. Furness, J. B. Types of neurons in the enteric nervous system. J. Auton. Nerv. Syst.
81, 87-96 (2000).

62. Curtis, M. M. & Sperandio, V. A complex relationship: the interaction among
symbiotic microbes, invading pathogens, and their mammalian host. Mucosal
Immunol. 4, 133-138 (2011).

63. Moreira, C. G., Weinshenker, D. & Sperandio, V. QseC mediates Salmonella
enterica serovar Typhimurium virulence in vitro and in vivo. Infect. Immun. 78,
914-926 (2010).

64. Sperandio, V., Torres, A. G., Jarvis, B., Nataro, J. P. & Kaper, J. B. Bacteria-host
communication: the language of hormones. Proc. Natl Acad. Sci. USA 100,
8951-8956 (2003).
This paper describes how signals from the microbiota and host converge to
| NATURE | VOL 535 | 7 JULY 2016 


65. 


66. 


67. 


68. 
69. 


70. 


71. 


72. 


73. 


74, 


75, 


76. 


77. 


78. 


79. 


80. 


81. 


82. 


83. 


84. 


85. 


86. 


87. 


88. 


89. 


90. 


91. 


enhance virulence in enteric pathogens. 

Nakano, M., Takahashi, A., Sakai, Y. & Nakaya, Y. Modulation of pathogenicity 
with norepinephrine related to the type Ill secretion system of Vibrio 
parahaemolyticus. J. Infect. Dis. 195, 1353-1360 (2007). 

Clarke, M. B., Hughes, D. T., Zhu, C., Boedeker, E. C. & Sperandio, V. The QseC 
sensor kinase: a bacterial adrenergic receptor. Proc. Natl Acad. Sci. USA 103, 
10420-10425 (2006). 
Reading, N. C., Rasko, D. A., Torres, A. G. & Sperandio, V. The two-component 
system QseEF and the membrane protein QseG link adrenergic and stress 
sensing to bacterial pathogenesis. Proc. Nat! Acad. Sci. USA 106, 5889-5894 
(2009). 
Seksik, P. et al. Alterations of the dominant faecal bacterial groups in patients 
with Crohn’s disease of the colon. Gut 52, 237-242 (2003). 

Gophna, U., Sommerfeld, K., Gophna, S., Doolittle, W. F. & Veldhuyzen van 
Zanten, S. J. Differences between tissue-associated intestinal microfloras 

of patients with Crohn’s disease and ulcerative colitis. J. Clin. Microbiol. 44, 
4136-4141 (2006). 

Baumgart, M. et a/. Culture independent analysis of ileal mucosa reveals a 
selective increase in invasive Escherichia coli of novel phylogeny relative to 
depletion of Clostridiales in Crohn’s disease involving the ileum. /SME J. 1, 
403-418 (2007). 

Walker, A. W. et a/. High-throughput clone library analysis of the mucosa- 
associated microbiota reveals dysbiosis and differences between inflamed 

and non-inflamed regions of the intestine in inflammatory bowel disease. BMC 
Microbiol. 11, 7 (2011). 

Gevers, D. et al. The treatment-naive microbiome in new-onset Crohn's disease. 
Cell Host Microbe 15, 382-392 (2014). 

Chiodini, R. J. et al. Microbial population differentials between mucosal and 
submucosal intestinal tissues in advanced Crohn’s disease of the ileurn. PLoS 
ONE 10, e0134382 (2015). 

Lupp, C. et al. Host-mediated inflammation disrupts the intestinal microbiota 
and promotes the overgrowth of Enterobacteriaceae. Cell Host Microbe 2, 
119-129; corrigendum 2, 204 (2007). 

This paper reported that the induction of inflammation in the host disrupts the 
microbiota and promotes a bloom of Enterobacteriaceae. 

Garrett, W. S. et al. Enterobacteriaceae act in concert with the gut microbiota 

to induce spontaneous and maternally transmitted colitis. Cel/ Host Microbe 8, 
292-300 (2010). 

Raffatellu, M. et al. Simian immunodeficiency virus-induced mucosal 
interleukin-17 deficiency promotes Salmonella dissemination from the gut. 
Nature Med. 14, 421-428 (2008). 

Godinez, |. et al. T cells help to amplify inflammatory responses induced by 
Salmonella enterica serotype Typhimurium in the intestinal mucosa. Infect. 
Immun. 76, 2008-2017 (2008). 

Pickard, J. M. et al. Rapid fucosylation of intestinal epithelium sustains host- 
commensal symbiosis in sickness. Nature 514, 638-641 (2014). 

This paper showed that cytokines that are induced by enteric infection 
promote fucosylation in the host. 

Pham, T. A. et al. Epithelial IL-22RA1-mediated fucosylation promotes intestinal
colonization resistance to an opportunistic pathogen. Cell Host Microbe 16, 
504-516 (2014). 

Sonnenburg, J. L. et al. Glycan foraging in vivo by an intestine-adapted bacterial 
symbiont. Science 307, 1955-1959 (2005). 

Ansong, C. et al. Studying Salmonellae and Yersiniae host-pathogen interactions 
using integrated ’omics and modeling. Curr. Top. Microbiol. Immunol. 363, 21-41 
(2013). 

Harper, R. W. et al. Differential regulation of dual NADPH oxidases/peroxidases,
Duox1 and Duox2, by Th1 and Th2 cytokines in respiratory tract epithelium.
FEBS Lett. 579, 4911-4917 (2005).

Haberman, Y. et al. Pediatric Crohn disease patients exhibit specific ileal
transcriptome and microbiome signature. J. Clin. Invest. 124, 3617-3633
(2014).

Salzman, A. et al. Induction and activity of nitric oxide synthase in cultured
human intestinal epithelial monolayers. Am. J. Physiol. 270, G565-G573 (1996).

Palmer, R. M., Rees, D. D., Ashton, D. S. & Moncada, S. L-arginine is the
physiological precursor for the formation of nitric oxide in endothelium-
dependent relaxation. Biochem. Biophys. Res. Commun. 153, 1251-1256
(1988).

Lundberg, J. O., Weitzberg, E., Lundberg, J. M. & Alving, K. Intragastric nitric 
oxide production in humans: measurements in expelled air. Gut 35, 1543-1546 
(1994). 

Singer, I. I. et al. Expression of inducible nitric oxide synthase and nitrotyrosine
in colonic epithelium in inflammatory bowel disease. Gastroenterology 111, 
871-885 (1996). 

Enocksson, A., Lundberg, J., Weitzberg, E., Norrby-Teglund, A. & Svenungsson, B. 
Rectal nitric oxide gas and stool cytokine levels during the course of infectious 
gastroenteritis. Clin. Diagn. Lab. Immunol. 11, 250-254 (2004). 

Bai, P. et al. Protein tyrosine nitration and poly(ADP-ribose) polymerase 
activation in N-methyl-N-nitro-N-nitrosoguanidine-treated thymocytes: 
implication for cytotoxicity. Toxicol. Lett. 170, 203-213 (2007). 

Dudhgaonkar, S. P., Tandan, S. K., Kumar, D., Raviprakash, V. & Kataria, M. 
Influence of simultaneous inhibition of cyclooxygenase-2 and inducible nitric 
oxide synthase in experimental colitis in rats. Inflammopharmacology 15, 
188-195 (2007). 

Winter, S. E. et al. Host-derived nitrate boosts growth of E. coli in the inflamed gut.
Science 339, 708-711 (2013).

This study revealed that nitrate respiration drives the expansion of commensal
E. coli during colitis.

92. Winter, S. E. & Baumler, A. J. Why related bacterial species bloom simultaneously 
in the gut: principles underlying the ‘like will to like’ concept. Cell. Microbiol. 16, 
179-184 (2014). 

93. Rivera-Chavez, F. & Baumler, A. J. The pyromaniac inside you: Salmonella 
metabolism in the host gut. Annu. Rev. Microbiol. 69, 31-48 (2015). 

94. Galan, J. E. & Curtiss, R. III. Cloning and molecular characterization of genes
whose products allow Salmonella typhimurium to penetrate tissue culture cells. 
Proc. Natl Acad. Sci. USA 86, 6383-6387 (1989). 

95. Hensel, M. et al. Simultaneous identification of bacterial virulence genes by 
negative selection. Science 269, 400-403 (1995). 

96. Tsolis, R. M., Adams, L. G., Ficht, T. A. & Baumler, A. J. Contribution of Salmonella
typhimurium virulence factors to diarrheal disease in calves. Infect. Immun. 67, 
4879-4885 (1999). 

97. Barthel, M. et al. Pretreatment of mice with streptomycin provides a Salmonella 
enterica serovar Typhimurium colitis model that allows analysis of both 
pathogen and host. Infect. Immun. 71, 2839-2858 (2003). 

98. Coburn, B., Li, Y., Owen, D., Vallance, B. A. & Finlay, B. B. Salmonella enterica 
serovar Typhimurium pathogenicity island 2 is necessary for complete virulence 
in a mouse model of infectious enterocolitis. Infect. Immun. 73, 3219-3227 
(2005). 

99. Stecher, B. et al. Salmonella enterica serovar Typhimurium exploits inflammation
to compete with the intestinal microbiota. PLoS Biol. 5, 2177-2189 (2007).

100. Lawley, T. D. et al. Host transmission of Salmonella enterica serovar Typhimurium
is controlled by virulence factors and indigenous intestinal microbiota. Infect.
Immun. 76, 403-416 (2008).

101. Lopez, C. A. et al. Phage-mediated acquisition of a type III secreted effector
protein boosts growth of Salmonella by nitrate respiration. mBio 3, e00143-12
(2012).

102. Lopez, C. A., Rivera-Chavez, F., Byndloss, M. X. & Baumler, A. J. The periplasmic
nitrate reductase NapABC supports luminal growth of Salmonella enterica
serovar Typhimurium during colitis. Infect. Immun. 83, 3470-3478 (2015).

103. Levitt, M. D., Furne, J., Springfield, J., Suarez, F. & DeMaster, E. Detoxification
of hydrogen sulfide and methanethiol in the cecal mucosa. J. Clin. Invest. 104,
1107-1114 (1999).

104. Harris, J. C., Dupont, H. L. & Hornick, R. B. Fecal leukocytes in diarrheal illness.
Ann. Intern. Med. 76, 697-703 (1972).

105. Loetscher, Y. et al. Salmonella transiently reside in luminal neutrophils in the
inflamed gut. PLoS ONE 7, e34812 (2012).

106. Winter, S. E. et al. Gut inflammation provides a respiratory electron acceptor for
Salmonella. Nature 467, 426-429 (2010).

This paper showed that tetrathionate is a unique electron acceptor that is
induced through inflammation that promotes expansion of S. Typhimurium.

107. Muller, H. J. Partial list of biological institutes and biologists doing experimental
work in Russia at the present time. Science 57, 472-473 (1923).

108. Raffatellu, M. et al. Lipocalin-2 resistance confers an advantage to Salmonella
enterica serotype Typhimurium for growth and survival in the inflamed intestine.
Cell Host Microbe 5, 476-486 (2009).

109. Behnsen, J. et al. The cytokine IL-22 promotes pathogen colonization by
suppressing related commensal bacteria. Immunity 40, 262-273 (2014).

110. Goetz, D. H. et al. The neutrophil lipocalin NGAL is a bacteriostatic agent that
interferes with siderophore-mediated iron acquisition. Mol. Cell 10, 1033-1043
(2002).

111. Flo, T. H. et al. Lipocalin 2 mediates an innate immune response to bacterial
infection by sequestrating iron. Nature 432, 917-921 (2004).

112. Deriu, E. et al. Probiotic bacteria reduce Salmonella typhimurium intestinal
colonization by competing for iron. Cell Host Microbe 14, 26-37 (2013).

This paper revealed that probiotic strains of E. coli restrict infection of the gut
with S. Typhimurium by competing for sources of iron.

113. Cascales, E. et al. Colicin biology. Microbiol. Mol. Biol. Rev. 71, 158-229 (2007).

114. Guterman, S. K. Colicin B: mode of action and inhibition by enterochelin.
J. Bacteriol. 114, 1217-1224 (1973).

115. Cardelli, J. & Konisky, J. Isolation and characterization of an Escherichia coli
mutant tolerant to colicins Ia and Ib. J. Bacteriol. 119, 379-385 (1974).

116. Patzer, S. I., Baquero, M. R., Bravo, D., Moreno, F. & Hantke, K. The colicin G,
H and X determinants encode microcins M and H47, which might utilize the
catecholate siderophore receptors FepA, Cir, Fiu and IroN. Microbiology 149,
2557-2570 (2003).

117. Hooper, L. V. et al. Molecular analysis of commensal host-microbial relationships
in the intestine. Science 291, 881-884 (2001).

This was the first report to show that the microbiota can change the expression
of mammalian genes to modulate the immune system of the host.

118. Wu, M. et al. Genetic determinants of in vivo fitness and diet responsiveness in
multiple human gut Bacteroides. Science 350, aac5992 (2015).

119. Earle, K. A. et al. Quantitative imaging of gut microbiota spatial organization. Cell
Host Microbe 18, 478-488 (2015).

120. Bouslimani, A. et al. Molecular cartography of the human skin surface in 3D.
Proc. Natl Acad. Sci. USA 112, E2120-E2129 (2015).




121.Marcobal, A. et al. A metabolomic view of how the human gut microbiota 
impacts the host metabolome using humanized and gnotobiotic mice. ISME J. 7, 
1933-1943 (2013). 

122.Lau, W., Fischbach, M. A., Osbourn, A. & Sattely, E. S. Key applications of plant 
metabolic engineering. PLoS Biol. 12, e1001879 (2014).

123.Marcobal, A. et al. Metabolome progression during early gut microbial 
colonization of gnotobiotic mice. Sci. Rep. 5, 11589 (2015). 

124. Rath, C. M. et al. Molecular analysis of model gut microbiotas by imaging mass
spectrometry and nanodesorption electrospray ionization reveals dietary 
metabolite transformations. Anal. Chem. 84, 9259-9267 (2012). 

This study used imaging mass spectrometry to map differences in the gut 
metabolic landscape that are promoted by the microbiota. 

125.Dorrestein, P. C., Mazmanian, S. K. & Knight, R. Finding the missing links among 
metabolites, microbes, and the host. Immunity 40, 824-832 (2014). 

126.Antunes, L. C. et al. Antivirulence activity of the human gut metabolome. mBio 5, 
e01183-14 (2014). 

127.Antunes, L. C. et al. Impact of Salmonella infection on host hormone metabolism 
revealed by metabolomics. Infect. Immun. 79, 1759-1769 (2011). 

128. Buffie, C. G. et al. Precision microbiome reconstitution restores bile acid 
mediated resistance to Clostridium difficile. Nature 517, 205-208 (2015). 

This paper showed that the introduction of a single species of bacteria that can 
produce secondary bile salts confers the host with resistance to the expansion 
of C. difficile after treatment with antibiotics. 

129.Koenigsknecht, M. J. & Young, V. B. Faecal microbiota transplantation for the 
treatment of recurrent Clostridium difficile infection: current promise and future 
needs. Curr. Opin. Gastroenterol. 29, 628-632 (2013). 

130. Sekirov, I. et al. Antibiotic-induced perturbations of the intestinal microbiota alter
host susceptibility to enteric infection. Infect. Immun. 76, 4726-4736 (2008). 

131.Louis, E., Libioulle, C., Reenaers, C., Belaiche, J. & Georges, M. Genomics of 
inflammatory bowel diseases: basis for a new molecular classification and new 
therapeutic strategies of these diseases [in French]. Rev. Med. Liege 64 S1, 
24-28 (2009). 

132. Meynell, G. G. Antibacterial mechanisms of the mouse gut. II. The role of Eh and
volatile fatty acids in the normal gut. Br. J. Exp. Pathol. 44, 209-219 (1963). 

133.Donohoe, D. R., Wali, A., Brylawski, B. P. & Bultman, S. J. Microbial regulation 
of glucose metabolism and cell-cycle progression in mammalian colonocytes. 
PLoS ONE 7, e46589 (2012). 

134.Atarashi, K., Umesaki, Y. & Honda, K. Microbiotal influence on T cell subset 
development. Semin. Immunol. 23, 146-153 (2011). 

135. Atarashi, K. et al. Treg induction by a rationally selected mixture of Clostridia
strains from the human microbiota. Nature 500, 232-236 (2013). 

136.Furusawa, Y. et al. Commensal microbe-derived butyrate induces the 
differentiation of colonic regulatory T cells. Nature 504, 446-450 (2013). 

137.Spees, A. M. et al. Streptomycin-induced inflammation enhances Escherichia coli 
gut colonization through nitrate respiration. mBio 4, e00430-13 (2013).

138. Itoh, K. & Freter, R. Control of Escherichia coli populations by a combination of 
indigenous clostridia and lactobacilli in gnotobiotic mice and continuous-flow 
cultures. Infect. Immun. 57, 559-565 (1989). 

139.Wells, J. E. & Hylemon, P. B. Identification and characterization of a bile acid 
7α-dehydroxylation operon in Clostridium sp. strain TO-931, a highly active
7α-dehydroxylating strain isolated from human feces. Appl. Environ. Microbiol.
66, 1107-1113 (2000). 

140.Wilson, K. H. Efficiency of various bile salt preparations for stimulation of 
Clostridium difficile spore germination. J. Clin. Microbiol. 18, 1017-1019 (1983). 

141.Sorg, J. A. & Sonenshein, A. L. Bile salts and glycine as cogerminants for 
Clostridium difficile spores. J. Bacteriol. 190, 2505-2512 (2008). 

142. Theriot, C. M. et al. Antibiotic-induced shifts in the mouse gut microbiome and 
metabolome increase susceptibility to Clostridium difficile infection. Nature 
Commun. 5, 3114 (2014). 

143.Weingarden, A. R. et al. Microbiota transplantation restores normal fecal bile acid 
composition in recurrent Clostridium difficile infection. Am. J. Physiol. Gastrointest. 
Liver Physiol. 306, G310-G319 (2014). 

144. Bernstein, H., Bernstein, C., Payne, C. M., Dvorakova, K. & Garewal, H. Bile acids 
as carcinogens in human gastrointestinal cancers. Mutat. Res. 589, 47-65 
(2005). 


Acknowledgements Work in the V.S. laboratory is supported by US National
Institutes of Health (NIH) grants AI053067, AI077613, AI05135 and AI114511.
Work in the A.J.B. laboratory is supported by US Department of Agriculture grant
2015-67015-22930 and NIH grants AI044170, AI096528, AI112445, AI114922 and
AI117940. The contents of this Review are solely the responsibility of the authors and
do not necessarily represent the official views of the NIH National Institute of Allergy
and Infectious Diseases.


Author information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of this 
paper at go.nature.com/28ix4vg. Correspondence should be addressed to V.S. 
(vanessa.sperandio@utsouthwestern.edu). 




REVIEW 


doi:10.1038/nature18850 


Microbiome-wide association studies link 
dynamic microbial consortia to disease 


Jack A. Gilbert, Robert A. Quinn, Justine Debelius, Zhenjiang Z. Xu, James Morton, Neha Garg, Janet K. Jansson,
Pieter C. Dorrestein & Rob Knight


Rapid advances in DNA sequencing, metabolomics, proteomics and computational tools are dramatically increasing access 
to the microbiome and identification of its links with disease. In particular, time-series studies and multiple molecular 
perspectives are facilitating microbiome-wide association studies, which are analogous to genome-wide association 
studies. Early findings point to actionable outcomes of microbiome-wide association studies, although their clinical 
application has yet to be approved. An appreciation of the complexity of interactions among the microbiome and the host’s 
diet, chemistry and health, as well as determining the frequency of observations that are needed to capture and integrate 
this dynamic interface, is paramount for developing precision diagnostics and therapies that are based on the microbiome. 


The role of individual species of microbes in infectious disease has
been known since the work of microbiologists Robert Koch and
Louis Pasteur in the nineteenth century. Yet the part played by complex
communities of microbes (known as microbiotas) in providing fertile
ground for infections and in setting the stage for non-communicable
diseases has been appreciated only in the past decade. The gut microbiota,
for example, has been linked to a variety of conditions, some of which are
predictable (irritable bowel syndrome¹ and inflammatory bowel disease
(IBD) in adults² and children³), whereas others are intriguing (obesity⁴,⁵,
cardiovascular disease⁶, colon cancer⁷ and rheumatoid arthritis⁸) or truly
surprising (major depression⁹, Parkinson's disease¹⁰ and autism spectrum
disorder¹¹).

Many ways in which the microbiota might drive disease have been 
identified, but their relative importance is yet to be determined. For 
instance, the taxonomic composition of the microbiota might be most 
important, and this could be influenced by the overall diversity of spe- 
cies or by the presence of particular taxa, either of which can distinguish 
healthy individuals from those with disease states. If the collective genes of 
the microbiota (the microbiome) are more important, the overall genetic 
diversity or genetic composition, or even specific genetic lineages or 
metabolic pathways, might play a crucial part in shaping a disease. How 
such genes are expressed as transcripts and proteins could also have an 
effect. If the metabolome — the set of chemicals produced by the micro- 
biota and host — is of overriding concern, whether different communi- 
ties of microbes could lead to the same metabolic and immunological 
consequences should be considered. Overall, the molecular states of the 
microbiome probably interact through myriad feedback mechanisms 
that constantly respond and react to one another to produce the observed 
disease outcomes. 

This Review describes the ways in which the microbiota and the micro- 
biome, as well as specific functions of both, have been linked to various 
diseases. It also looks at some of the technical and conceptual pitfalls that 
must be avoided when designing studies that investigate these links. Such 
issues become compounded when studies are scaled up to cover tens of 
thousands of people over time and when they are designed to understand 
subtle and systems-level effects that result from the interactions of many
factors. Microbiome-wide association studies (MWAS)¹², which operate at
this scale, provide a means of capturing complex, multidimensional
interactions to predict practicable links between
or their features to phenotypes such as disease, with appropriate controls 
for composition of the microbiota and unusual statistical characteristics 
of microbiome data sets. Although MWAS are somewhat analogous to 
genome-wide association studies (GWAS), the microbiome contains 
many more genes than does the host genome, and its composition changes 
over time within a person (Box 1). MWAS are useful for untangling the 
mechanisms that link communities of microbes and their functions to 
disease, although most clinical applications are yet to be fully realized. 
To achieve this, model systems should be devised and implemented that 
allow the testing of hypotheses on isolated and combinatorial functions 
of microbes and interventions for capturing mechanisms of action. Such 
systems should also enable these ideas to be applied more generally to the 
complex communities of microbes that inhabit the body. 


Microbial biomarkers 

The human microbiota is the collection of microscopic organisms that 
live in the body, and it contains representatives from all domains of life: 
the archaea, the bacteria and the eukarya. Viruses, including bacterio- 
phages, are not always encompassed by the definition of the microbiota. 
They probably should be, however, because they can shape the structure 
of the community through top-down ecological control and they have 
their own effects on the immune system of the host¹³. Most approaches to
identifying microbial changes have applied biomarker discovery to test for 
differences between people with the conditions of interest and controls. 
Changes in the structure of the microbiota that are associated with disease 
states can occur at any taxonomic rank and along any relevant branch of 
the phylogenetic tree. For example, changes at the phylum level have been 
reported in human obesity” and IBD’, and strain-level associations have 
been made with the metabolism of drugs in humans“. For instance, the 
risk of colon cancer in mice increases in the presence of particular strains 
of Escherichia coli that express a gene cluster that produces a genotoxic 
secondary metabolite called colibactin’*. Between these extremes, changes 


¹Department of Surgery, University of Chicago, Chicago, Illinois 60637, USA. ²Department of Pharmacology, University of California San Diego, La Jolla, California 92093, USA. ³Collaborative Mass
Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, USA. ⁴Center for Microbiome Innovation,
Jacobs School of Engineering, University of California, San Diego, La Jolla, California 92093, USA. ⁵Department of Pediatrics, University of California, San Diego School of Medicine, La Jolla,
California 92093, USA. ⁶Department of Computer Science and Engineering, Jacobs School of Engineering, University of California San Diego, La Jolla, California 92093, USA. ⁷Earth and Biological
Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99354, USA.




BOX 1

Principles of microbiome-wide association studies

Microbiome-wide association studies (MWAS) are similar in concept
to GWAS: the goal of both is to link a complex collection of features
(for example, species or genes) to phenotype. However, there are
important differences between the two. First, there are many more
microbial genes than human ones, with some studies estimating
that there are more than 100 microbial genes for every human
gene¹¹¹⁻¹¹³. Consequently, the issue of multiple comparisons is of
greater importance to MWAS. Second, all individuals share almost the
same collection of human genes but their dissimilarity in microbial
species and microbial genes is much greater. Third, genes in the
human genome can be counted easily but most microbiome data
come in the form of relative abundance. Compositional statistics
therefore apply and the data cannot be represented in familiar
Euclidean spaces. As a result, microbiome analyses are very prone
to misinterpretation. For instance, it is impossible to infer the growth
or decay of microbes purely on the basis of relative abundance data
because the growth of one species could also be explained by the
decline of all other species. Last, whereas the human genome is
essentially fixed within an individual (except in special cases such
as the immune system and cancer), the microbiome of each person
changes profoundly throughout his or her lifetime. Several designs for
MWAS link the overall microbiome to specific phenotypes. A number of
important questions must therefore be asked when designing MWAS.

• At what level will the microbiome or microbiota be assessed?
MWAS can be carried out using species, genes, functional categories
of genes or, less frequently, transcripts and proteins as features.
Metabolome-wide association studies are also possible, and they
can be carried out at the level of individual spectra, groups of related
spectra or pathways. These analyses often give different results; for
example, in the Human Microbiome Project, pathway-level analysis of
the shotgun metagenomic data suggested that much less variability
existed between people than did taxon-level analysis.

• Will the microbiome be examined in terms of overall variation
or as a collection of individual features? Techniques for reducing
the dimensionality of the microbiome include clustering, principal
coordinates analysis (PCoA) with a variety of distance metrics,
principal component analysis, correspondence analysis, factor analysis
and discriminant analysis. In clustering analyses, which include
enterotyping, samples are grouped into clusters. The resulting clusters
are then tested for association with a phenotype (for example, whether
the resting levels of blood glucose are identical in each cluster). During
dimensionality reduction, one or more axes are discovered through
a supervised or an unsupervised approach, and the dependence of
phenotype on locations along these axes is tested, for example, by
correlation approaches. Supervised approaches such as discriminant
analysis make use of phenotype labels and provide the projection
of the data that best separates these class labels. Statistical tests of
location on the resulting axes must therefore be used with caution
because even small departures from the random model can lead to
apparent separation when there is none. Unsupervised approaches
such as PCoA use only the intrinsic similarities and differences in the
samples; however, they may not reveal separation by phenotypic state
even when it exists (because it could come only in later principal axes).
Techniques for associating individual features of the microbiome
with phenotype, including appropriate statistics for repeated
measures, include Metastats¹¹⁴, DESeq2 (ref. 115) and ANCOM (analysis of
composition of microbiomes)¹¹⁶, as well as various machine-learning
approaches such as Random Forests. Unfortunately, it is also
challenging to infer which species differ significantly in compositional
data sets. Many state-of-the-art tools make assumptions about the
underlying data to identify significantly different species. Analysts need
to gauge the assumptions made by each tool before applying it
to their data sets because these assumptions are typically not true of
real-world data.

• What corrections will be performed for multiple statistical
comparisons, sparsity and compositionality of the data, and other
features of microbiome and related data sets? Often, associations
will be sought between the microbiome (as a whole or as a collection
of features) and measures of phenotype. In many of these studies,
the differences between phenotypes can be described by a select
few features. Conventional statistical tests can be confounded by the
underlying ecology. For instance, multiple microbes can share the
same functional roles. As a result, different patterns of microbial
abundance might yield the same phenotype. Analyses should be separated
into planned analyses (those specified before the data are analysed) and
ad hoc analyses (those devised afterwards); ad hoc analyses should be
considered to be exploratory rather than formal statistical tests.

• How will causality be established? Causality can be approached
in a number of ways: through prospective longitudinal studies that
demonstrate that a microbial or metabolic change precedes the
disease phenotype; the demonstration that a clinical manipulation
of the microbiome affects the disease process; preclinical work in
mice or other animal models that demonstrates the plausibility of a
mechanism; or establishment of the activity of chemical products of
the microbiome that are linked to specific microbes or the genes that
produce them. Studies that combine animal models with proof-of-
relevance in people are especially effective, although they are rare.
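
The box's point that growth or decay cannot be inferred from relative abundance alone can be illustrated with a small sketch. The taxon names and cell counts below are hypothetical and chosen only for illustration; they do not come from any study.

```python
def relative_abundance(counts):
    """Convert absolute counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical absolute cell counts for three taxa at baseline.
baseline = {"taxon_A": 100, "taxon_B": 200, "taxon_C": 200}

# Scenario 1: taxon_A doubles while the other taxa stay constant.
a_grows = {"taxon_A": 200, "taxon_B": 200, "taxon_C": 200}

# Scenario 2: taxon_A stays constant while the other taxa halve.
others_decline = {"taxon_A": 100, "taxon_B": 100, "taxon_C": 100}

profile_baseline = relative_abundance(baseline)
profile_1 = relative_abundance(a_grows)
profile_2 = relative_abundance(others_decline)

# Both scenarios give the same relative profile, so a sequencing
# read-out of proportions cannot tell them apart.
print(profile_baseline["taxon_A"], profile_1["taxon_A"], profile_2["taxon_A"])
```

In both scenarios the share of taxon_A rises from one-fifth to one-third, which is why an independent measure of absolute abundance would be needed to distinguish them.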


at the genus level are useful for many applications, including microbial 
source tracking" and, more controversially, defining enterotypes, which 
are classifications of types of microbial communities in the gut”. 


Taxonomic biomarkers 

Most studies have focused on identifying single organisms as biomarkers, 
but separating collections of samples on the basis of similarity between 
communities has also been useful for a wide range of diseases, including 
IBD". However, the extent to which the choice of metric for pairwise 
comparisons of communities can influence the result is not widely appre- 
ciated. The fit between the data and a statistical model is often used to 
assess the validity of the technique. But when a collection of samples is 
highly heterogeneous, which includes situations as simple as the collec- 
tion of skin samples from different people, models that better fit the data 
in the original data set might provide no clear biological interpretation. 


Importantly, this problem cannot be overcome by collecting more data 
because using the incorrect statistical model can obscure results that 
can be clearly determined, even with limited numbers of DNA or RNA 
sequences. The choice of distance metric, level of taxonomic resolution
or a particular taxon to focus on can involve dozens of further, implicit, 
comparisons that also must be accounted for statistically. 
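
One common way to account for many comparisons is to control the false discovery rate. The sketch below uses the Benjamini-Hochberg procedure, which is our choice for illustration and is not named in the text; the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from testing five taxa against a phenotype.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)

n_raw = sum(p < 0.05 for p in raw)
n_adj = sum(p < 0.05 for p in adj)
print(n_raw, n_adj)
```

With these illustrative inputs, four taxa pass an unadjusted 0.05 threshold but only two remain after adjustment. Implicit choices such as the distance metric or taxonomic level would need to be counted among the tests for the correction to be meaningful.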

The identification of interactions between microbes is essential for 
microbial ecology. Correlation networks have proved useful for distilling 
relevant links from a morass of potential interactions. However, inter- 
pretation is still complex for two reasons. First, the abundance of specific 
microorganisms in each microbiota is sampled through a multinomial 
distribution, which leads to large numbers of negative correlations and 
induces a substantial bias in network topology. Second, taxonomic data 
are extremely sparse: most samples have zero abundance of a particular 
organism. Because of these correlation problems, network analyses can 




[Figure 1 artwork labels: Homoserine lactone; Acetaminophen; Rha-Rha-C10-C10; Acetate; Indole; Acetaminophen sulfate; Tryptophan and other amino acids; Primary metabolites from core genome; Secondary metabolites from accessory genome; Drug or xenobiotic altered by microbiome; Host metabolite altered by microbiome; Microbial enzyme; Gut microbiota; Gall bladder]


Figure 1 | Sources of metabolites from the human microbiome. The core 
physiology of the microbial cells that make up the microbiome can produce 
by-products and intermediates that affect health, including short-chain fatty 
acids (such as acetate) and tryptophan metabolites. Secondary (or specialized) 
metabolites are produced from accessory genetic elements that are often 
transferred horizontally between microbes. Some of these metabolites, 
including colibactin’’ and rhamnolipids™ (Rha-Rha-C10-C10), are known to 
cause disease. Microbes can also alter metabolites that are produced by the host, 
such as bile acids''® (CA, cholic acid) and even drugs that are consumed, such as 
acetaminophen (paracetamol). DCA, deoxycholic acid; Rha, rhamnose. 


be inherently flawed”. Despite such limitations, taxonomic correlation 
networks have identified microbial interactions that are linked to disease, 
including beneficial and harmful networks of microbes that are associated 
with Crohn's disease”. 
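
The negative-correlation bias described above can be reproduced in a small simulation. This is a minimal sketch under stated assumptions: three hypothetical taxa with identical, fixed true proportions are sampled at a fixed sequencing depth, so there is no real ecological relationship between them, yet their read counts still correlate negatively.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_counts(n_samples, depth, proportions, seed=0):
    """Draw multinomial read counts per sample from fixed taxon proportions."""
    rng = random.Random(seed)
    taxa = range(len(proportions))
    samples = []
    for _ in range(n_samples):
        reads = rng.choices(taxa, weights=proportions, k=depth)
        samples.append([reads.count(t) for t in taxa])
    return samples


# Hypothetical settings: 1,000 samples, 200 reads each, three equal taxa.
counts = simulate_counts(n_samples=1000, depth=200, proportions=[1, 1, 1])
r_01 = pearson([s[0] for s in counts], [s[1] for s in counts])
print(round(r_01, 2))
```

Because every read assigned to one taxon is a read not assigned to another, the correlation between any two taxa is pulled towards -0.5 in this three-taxon setting, even though the taxa were simulated without any interaction.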

These examples of successful biomarker discovery have yet to provide 
standard guidelines; however, they have produced interesting findings. 
For example, a higher level of taxonomic resolution is not always better. 
16S ribosomal RNA operational taxonomic units (OTUs), which are clus- 
ters of sequences that are defined by sequence identity, at the species level 
are best for matching samples, yet this taxonomic level actually decreases 
the accuracy of classifying individuals as lean or obese”. The level of reso- 
lution is therefore dependent on the context. 
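
How the identity threshold sets taxonomic resolution can be sketched with a toy greedy clustering. This is an illustrative simplification, not the method used in the studies cited: the reads are hypothetical, assumed to be pre-aligned and of equal length, and the two thresholds are arbitrary.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sketch assumes pre-aligned, equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold):
    """Assign each sequence to the first OTU whose seed it matches at or above threshold."""
    seeds, clusters = [], []
    for seq in sequences:
        for i, seed in enumerate(seeds):
            if identity(seq, seed) >= threshold:
                clusters[i].append(seq)
                break
        else:
            # No existing seed is similar enough: start a new OTU.
            seeds.append(seq)
            clusters.append([seq])
    return clusters


# Hypothetical ten-base reads standing in for 16S rRNA fragments.
reads = ["ACGTACGTAC", "ACGTACGTAT", "ACGTTCGTAT", "TTGTACCTAC"]

fine = greedy_otus(reads, threshold=0.95)    # stricter identity, more OTUs
coarse = greedy_otus(reads, threshold=0.80)  # looser identity, fewer OTUs
print(len(fine), len(coarse))
```

The same four reads form four OTUs at the stricter threshold and two at the looser one, so any downstream classifier sees a different feature table depending on the resolution chosen.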


Functional biomarkers 

Shotgun metagenomics, the sequencing of fragments from total DNA 
rather than of specific genes, provides more-complete information about 
the microbial community and enables many powerful analyses, although 




the choice can be bewildering, even to experienced researchers in the field. 
As well as identifying taxa down to the level of strains or genomic single- 
nucleotide polymorphisms (SNPs), DNA sequences can be grouped into 
many functional classifications using databases such as KEGG (Kyoto 
Encyclopedia of Genes and Genomes), COG (Clusters of Orthologous 
Groups of Proteins), GO (Gene Ontology) and EggNOG (Evolutionary 
Genealogy of Genes: Non-supervised Orthologous Groups). Metagenomics
studies commonly show a surprising consistency in functional profiles,
although the limited variation that does exist can often be explained
by taxonomy. Studies that separate samples of interest from controls at
different functional resolutions are yet to be adequately performed, however.
Shotgun metagenomics seems to outperform amplicon-based taxonomic 
analysis in the identification of individuals (compare ref. 25 with ref. 26). 
Re-analysis of 16S rRNA amplicon data using oligotyping”, a technique 
that is based on the fine detail of polymorphisms, has improved resolu- 
tion, and this is demonstrated by its ability to identify sexual partners 
through shared sequences. No examples are thought to exist in which 
shotgun metagenomics has been able to identify a medically relevant 
trait that could not have been revealed through taxonomic analysis alone, 
although the potential for doing so is high. 

Integrating human metagenomic and metabolomic profiles has great 
potential for discriminating between disease traits (Fig. 1). The ability 
to systematically link the variance in metabolomic data between sam- 
ples with changes in the composition and structure of communities of 
microbes from the same samples enables not only improved resolution but 
also the potential to infer the mechanisms that produce observed trends™. 
This potential is highlighted by a study” that shows how the microbiome 
alters bile-acid metabolite profiles during the establishment of Clostridium 
difficile in mice. Similarly, the ability to link metabolite profiles in urine 
and blood serum to microbial metabolism in the gut can help to synthe- 
size links between dysbiosis (an imbalance of microbes in the body) and 
the onset of neurological symptoms that are associated with conditions 
such as autism spectrum disorder in a mouse model”. Metaproteomics 
is also enabling the identification of new biomarkers. Proteins such as 
L-lactate dehydrogenase and arginine deiminase, as well as those that 
are involved in the synthesis of exopolysaccharides, iron metabolism 
and the immune response, seem to be indicative of a healthy human 
oral cavity”. The combination of microbial community profiling with 
metabolomics and proteomics has precipitated understanding of how 
the microbiota responds to specific disease states, including IBD” ™*. The 
combined findings reveal specific species (for example, Faecalibacterium 
prausnitzii), proteins and metabolites that are involved in the metabolism 
of butyrate and bile acids, which can be used to differentiate between 
individuals with inflammation of the ileum that is the result of Crohn’s 
disease and those with inflammation of the colon and a healthy gut. In 
another example”, children with non-alcoholic fatty liver disease show a 
significant increase in Gammaproteobacteria and Prevotella as well as in 
levels of ethanol and certain short-chain fatty acids (SCFAs), which leads 
to an increase in energy production and a decrease in the metabolism of 
carbohydrates and amino acids and in the activity of the urea cycle and 
urea transport systems. 


From correlation to causation 

A crucial challenge for the field is to move beyond associations between 
the microbiome and specific clinical states towards the establishment of 
causality. The importance of MWAS with large cohorts in determining 
causality should not be underestimated. The limitation of the case- 
control model is that it is impossible to distinguish whether the micro- 
biome drives the disease, the disease drives the microbiome or if both 
are modified by a confounding factor. For example, a lack of replication 
of the microbiota differences that separate people with type 2 diabetes 
from controls in Chinese and European cohorts was found to be due to 
variation in the levels of usage of the drug metformin, which is used only 
in the disease state and with different frequencies in the two populations 
and which had a large and unanticipated effect on their microbiotas”. 
Consequently, the effect that had been attributed to the disease was 


© 2016 Macmillan Publishers Limited. All rights reserved. 


actually the result of the treatment. 
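The metformin example can be illustrated with a toy calculation. The sketch below uses entirely hypothetical numbers (they are not taken from the cohorts discussed) and a hypothetical helper, `mean_abundance`, to show how a difference between cases and controls can vanish once treated individuals are set aside.

```python
from statistics import mean

# Hypothetical toy records: (has_disease, takes_drug, taxon_abundance).
# By construction the drug, not the disease, shifts the abundance.
records = (
    [(True, True, 15.0)] * 6      # cases receiving the drug
    + [(True, False, 10.0)] * 2   # untreated cases
    + [(False, False, 10.0)] * 8  # controls, none of whom take the drug
)


def mean_abundance(rows, disease, drug=None):
    """Mean abundance for a disease status, optionally restricted to a drug status."""
    values = [a for d, t, a in rows if d == disease and (drug is None or t == drug)]
    return mean(values)


# Naive case-control contrast: treatment is mixed into the "disease" signal.
naive_diff = mean_abundance(records, True) - mean_abundance(records, False)

# Contrast restricted to untreated individuals: the apparent effect disappears.
stratified_diff = (
    mean_abundance(records, True, drug=False)
    - mean_abundance(records, False, drug=False)
)
```

In this constructed example the naive contrast is non-zero and the stratified contrast is zero, because the abundance shift was assigned to the drug alone. Real data would call for a statistical model that adjusts for treatment rather than a simple difference of means.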

Several popular methods exist for identifying causality, each of which 
has specific strengths and weaknesses. Prospective longitudinal studies, 
such as of the CHILD (Canadian Healthy Infant Longitudinal Devel- 
opment) birth cohort, allow researchers to test whether changes in the 
microbiome precede or follow the development of disease. Such studies 
are expensive, however, and can require large populations to capture rare 
events. If it is difficult to continue to collect samples, the study population
can also be affected by attrition. Intervention studies, in which a deliberate
clinical event such as the administration of a drug is used to drive change
in the microbiome and phenotypes, are useful, but it is often unethical to 
withhold treatment from a control group to isolate the effect of the spe- 
cific intervention. Interventions such as faecal microbiota transplantation 
also face substantial regulatory hurdles, especially in the United States. 
The comparison of identical and non-identical twins can be valuable for 
unravelling genetic differences in the host: causality can be established 
because the microbiome is not known to modify the inheritable host 
genome. However, such cohorts are difficult to assemble and privacy 
issues can be considerable, especially when the same twins are used in 
many studies. Animal models can be helpful for establishing mechanisms, 
but the quantitative importance of these mechanisms for human disease 
is often less clear. For example, the demonstration that faecal microbiota 
transplantation from people who are lean or obese to germ-free mice 
can confer differences in adiposity indicates that microbes can affect this 
phenotype, but it does not establish that transplantation can affect the 
weight of obese people”. 


The metabolome reveals important microbial activities 
Metabolomic biomarkers are especially useful for diagnostics because 
changes in metabolism can be rapid and can reveal the physiological state 
of both the host and its microbiota. Such biomarkers are also the end 
products of the metabolism of microbes and they can provide mechanistic 
explanations for particular associations between microbes and disease. 
The metabolome is being characterized through metabolomics (the study 
of the complete repertoire of molecules in the body, which is analogous to 
genomics, the study of the complete repertoire of genes in the genome), 
metabonomics (the comparison of general metabolomics profiles with 
their many unidentified compounds, rather than the comparison of spe- 
cific metabolites within profiles) and exposomics (the study of cumulative 
exposures to molecules from the environment)”. A crucial challenge 
for the characterization of these molecules is that only about 1.8% of 
the chemical data that can be collected with mass spectrometry can be 
annotated”. Unlike the genomics community, the mass-spectrometry
community lacks adequate mechanisms of knowledge dissemination that
enable data reuse. To overcome this challenge, the community is developing
a plethora of resources to store data from mass spectrometry, including
databases such as MassBank, METLIN, MetaboLights and the Human
Metabolome Database (HMDB), the Metabolomics Workbench plat- 
form and the software OpenMS. Its efforts have also led to GNPS (Global 
Natural Products Social Molecular Networking), the first crowdsourced 
platform that enables the community-driven curation of mass-spectrom- 
etry data and dissemination of existing knowledge of mass spectrometry 
in the public domain". Ultimately, these databases and infrastructures 
for analysis will allow the estimation of metabolite flux from the genomes 
to enable prediction of the overall function of communities of microbes”. 
Although a few strains of gut microbes are pathogenic, most are harmless 
or beneficial to health; similarly, some molecules that are produced by 
microbes are detrimental to the health of the host®, but most are innocu- 
ous or even beneficial“. Metabolites are particularly important agents of 
the human microbiome. This is because molecules that are produced by 
the microbiota can cross epithelial barriers more freely than the microbes 
to cause systemic effects at distant sites in the body. 

The small-molecule repertoire of the human microbiome consists of 
four groups. The first is composed of primary metabolites, which are mol- 
ecules produced by the catabolic and anabolic reactions that are required 
for cellular growth and homeostasis. The second group comprises 


REVIEW 


specialized metabolites, which includes virulence factors, secondary 
metabolites and natural products. These compounds are produced by 
accessory genetic elements that are often acquired through horizontal 
gene transfer, and they are designed to directly influence the cells of the 
host and other microbes (Fig. 1). Knowledge of changes in the second- 
ary metabolites can be useful for understanding toxins, quorum sensing 
and beneficial secondary metabolites of food such as lycopenoids and 
carotenoids. The third group is composed of metabolites produced by 
cells of the host or from exogenous sources that are directly modified 
by microbial enzymes to create unique chemical products. Knowledge 
of changes in this group can be useful for understanding how microbes 
modify products of the host’s metabolism. The final group is the expo- 
some, which describes the chemistry and metabolites that are encoun- 
tered through exposure to personal-care products, medical intervention, 
food or the environment. Knowledge of changes in this group is especially 
useful for understanding how compounds that are applied to the body, 
whether intentionally or unintentionally, can trigger toxic responses 
or can be modified into forms that differ in activity from the originally 
applied compound. Although decades of research on primary metabolism 
have led to a good understanding of these four groups of metabolites, 
the specialized metabolome of microbes is a veritable sea of unknown
chemistry*”*.


Linking metabolomes to health and disease 

Evidence is accumulating that the metabolic output of the microbial 
metabolome has a direct impact on human health. Significant opportunities
exist to elucidate the mechanisms that result in this effect. However,
current methods of chemical annotation can identify only a small fraction 
of detected metabolites within the metabolome”, and models for testing 
hypotheses about the interactions among microbes, their molecules and 
the host are challenging to use. 

The best-known examples of microbiome-derived primary metabolites 
that affect human health are probably the SCFAs. SCFAs such as ace- 
tate, propionate and butyrate are produced through the fermentation of 
dietary fibre by gut microbes and then absorbed by epithelial cells, which 
provides them with energy”. Defects in the production of SCFAs have 
been linked to many conditions, including IBD®”, although it is unclear 
whether testing for SCFA levels per se has clinical value. 

The development of germ-free animal models has been very useful for 
identifying primary metabolites produced or altered by the microbiota of 
the host. A comparison of metabolomes from germ-free and colonized 
mice revealed that indole-3-propionic acid and other products of tryp- 
tophan metabolism are found only in mice with an intact microbiota 
and are associated with the presence of Clostridium sporogenes”. These 
tryptophan metabolites are thought to affect neuronal signalling in the 
gut and brain”. But their role in human health remains elusive. 

Some specialized metabolites from the human microbiota are known 
to cause disease. For instance, colibactin induces double-strand breaks in 
the DNA of human cells”. The genetic machinery for the production of 
colibactin can be transferred from pathogenic to non-pathogenic strains 
of E. coli. Colibactin is associated with colorectal cancer in mouse mod- 
els’ and provides an example of how commensal microbes of the gut can 
harbour or acquire specialized metabolites that can result in disease”. 
The true pathogenic elements within the human microbiota might be the 
genetic islands encoding specialized metabolites that circulate within the 
microbial ecosystem, rather than the core genomes of pathogenic species 
(Fig. 1). The prevalence of these genetic islands could be associated with 
the prevalence of microbiome-associated diseases; it is here that the inter- 
face between GWAS and MWAS can best be understood. Tests at the level 
of single genes, such as for genes that are necessary for colibactin produc- 
tion, might prove more useful for identifying preventive treatments than 
would tests for the presence or absence of specific taxa, akin to the way 
that levels of glucose or insulin are measured for diabetes. 


Integration of multi-omics studies 
To comprehensively understand the role of the human microbiome and 




its metabolome in health and disease, integrative analyses are needed that 
apply ‘omics’ techniques to animal or other empirical models. Integrative 
analysis can help to identify the effect of treatment with antibiotics on the 
gut microbiota during infection with Clostridium difficile in both mice 
and people™. A multidisciplinary approach that employs mathematical 
modelling, 16S rRNA gene sequencing, metagenome sequencing and 
animal models identified how microbiotas can help the hosts to resist 
C. difficile infection, which led to the identification of Clostridium scindens 
as a candidate for resistance to infection in mouse models. In a study of 
24 people who took antibiotics while undergoing chemotherapy”, half 
had active C. difficile infections, which suggests that there is an association 
between C. scindens and resistance to infection with C. difficile. Preven- 
tion of C. difficile infections through transfer of C. scindens to animals 
that were undergoing treatment with antibiotics confirmed this role”. 
Metagenomic and metabolomic-based findings” have been used to iden- 
tify the importance of bile acids in this resistance to infection, and subse- 
quent experiments showed that certain levels of specific bile acids were 
associated with resistance to C. difficile during treatment with antibiotics. 
This work is an excellent example of how a comprehensive approach to 
microbiome analysis can link the microbiome to disease. The next step is 
to translate such findings into clinically useful tests. 

The resident microbiota of the human gut has an important role in 
modulating the efficacy and toxicity of pharmaceuticals'**”. Variability 
in the microbiomes of individuals™ leads to differences in the metabo- 
lism of drugs and therefore in effective dose availability and side effects. 
Simultaneous measurement of variability in both the microbiome and the 
metabolome will play an important part in identifying causative mecha- 
nisms of xenobiotic metabolism. The role of microbiome-associated drug 
toxicity is exemplified in the treatment of colon cancer with irinotecan, 
which resulted in decreased efficacy of the drug in 40% of treated
individuals”. Irinotecan is reactivated in the gut by microbial β-glucuronidase
enzymes, which leads to diarrhoea and prevents administration of the
appropriate dose. Inhibitors that modulate the activity of the commensal
microbiota by specifically inhibiting the β-glucuronidase enzyme in
bacteria are in clinical trials”. This represents a precedent for the trans- 
lation of metabolic mechanisms of the human microbiota into clinical 
applications and highlights the importance of investigations in the fields 
of pharmacomicrobiomics” and pharmacometabonomics*’™. In con- 
junction with testing for the genes that encode these enzymes, inhibitor- 
based therapies could increase the efficacy of irinotecan, although the 
diagnostic and therapeutic system this approach requires has yet to be 
demonstrated. Advanced data acquisition and computing and stream- 
lined analysis pipelines are enabling multi-omics analysis to be performed 
on clinically relevant timescales, and the adoption of multi-omics microbiome
analysis in the clinic will probably emerge within the next decade”.
Future prospects should also reflect on how inhibiting specific enzymes 
of commensal microbes affects the overall activity and structure of the 
gut microbiota in the longer term. 


Dynamics of the microbiome 

Although many MWAS take a case-control approach, understanding 
how the microbiome as a whole changes remains a challenge. Relatively 
few studies have assessed the whole microbiome at many time points; 
such studies point towards using dynamic — rather than static — features 
as the input for MWAS. It is challenging to capture the dynamics of an 
invisible microbial world through snapshots of its current state. However, 
the situation has vastly improved in the past 15 years, during which DNA 
sequencing costs dropped by a factor of about one million. By increasing 
the frequency and depth of observations, the rate and directionality of the 
transfer of bacteria between ecosystems is starting to be inferred. 


Assessing the transfer of microbes between environments 

The application of microbial survey techniques to built environments 
and the people who inhabit those spaces has shown the utility of high 
spatiotemporal resolution for inferring interactions between people and 
surfaces in the environment at the microbiome level™. But even with daily




sampling and observations at multiple sites on each individual (such as 
the nose, the hand and the foot), as well as their pets and surfaces in their 
home, it is still difficult to make more than comparative statements about 
the microbial similarity of surfaces and changes in this similarity over 
time. Higher-resolution temporal analysis, such as hourly sampling®, can 
improve appreciation of the successional dynamics of these communities. 
These tools have not yet been applied to understanding how consistently 
specific components of the microbiota are transferred to or from people, 
let alone within the body. Alternative approaches, such as using differen- 
tial coverage of parts of the microbial genome to infer activity in a single 
sample”, have great promise for directly revealing activity, but samples 
still need to be assessed across time points because the activity of microbes 
can change rapidly in response to conditions. Direct monitoring of the 
transfer of microbes between environments and of the rapid dynamics in 
those environments will require a substantial improvement in the deter- 
mination of genotypic resolution and temporal and spatial sampling. Near 
real-time microbial epidemiology is being demonstrated with genotypic 
resolution (at the strain level) through the rapid genomic sequencing 
of individual species of pathogenic microbes in hospital settings”. It is 
essential that this technology is developed to be more applicable to entire 
communities of microbes, especially because the most important inputs 
to MWAS might not be the relative abundance of each microbe or gene
at a single time point but rather the variations in particular species over 
time, as well as their co-variations in linked environments. 
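Comparative statements about the microbial similarity of two sites over time can be made concrete with a dissimilarity measure. The sketch below uses Bray-Curtis dissimilarity, which is an assumption chosen for illustration (the text does not name a metric), and hypothetical daily count vectors for a hand and a household surface.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors over the same taxa:
    0.0 means identical composition, 1.0 means no shared taxa."""
    if len(a) != len(b):
        raise ValueError("vectors must cover the same taxa in the same order")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical counts for three taxa, sampled on three consecutive days.
hand = [[10, 0, 0], [8, 2, 0], [5, 5, 0]]
surface = [[0, 10, 0], [2, 8, 0], [5, 5, 0]]

# One dissimilarity value per day: a trajectory rather than a single snapshot.
trajectory = [bray_curtis(h, s) for h, s in zip(hand, surface)]
```

A falling trajectory such as this one indicates that the two sites are converging in composition, but, as noted above, it does not by itself reveal the direction in which microbes were transferred.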


Tracking pathogenic infections 

Clinical application of MWAS inspires a vision of a future in which the 
studies are used to track entire communities of microorganisms involved 
in the complex ‘pathobiomes’ that are associated with different disease
states. For example, the transfer of bacteria from mother to child might 
be tracked and augmented by personalized microbial therapies that range 
from vaginal inoculation™ to customized prebiotic and probiotic supplements
that are based on breast milk. This would require automated
approaches to quantify the abundance and composition (at serotype reso- 
lution) of whole communities of bacteria, as well as rapid deployment of 
MWAS techniques to determine current health status or to predict future 
health status from the trajectories. Although such sensors are not yet avail- 
able, key platforms are being developed that will provide a substantial 
improvement on existing systems”. However, real-time interpretation of 
the vast quantity of data that are produced by these sensors will require 
a radical improvement in automated data processing. This will demand 
the integration of statistical modelling, high-performance computing and 
engineering to enable high-throughput transfer, interpretation and visu- 
alization of spatiotemporal data. Despite the limitations of existing cor- 
relation network techniques (that is, their sparsity and compositionality), 
network analysis has helped to uncover real associations in complex data. 
One useful example of the prediction of interactions, and subsequent 
validation of the prediction through empirical observation, comes 
from marine microbial ecology: network analysis has been applied to 
microbial-community sequence data to predict an interaction between an 
acoel flatworm (Symsagittifera sp.) and a green microalga (Tetraselmis sp.), 
and the finding was subsequently validated using microscopy’. An exam- 
ple in people is the use of correlation network analysis to demonstrate the 
connectivity of organisms in the microbiota of human milk. Cooperative 
and opportunistic subgroups have been identified in which the oppor- 
tunistic pathogenic species could, in principle, be suppressed through 
competitive exclusion”, pointing to therapeutic approaches based on pro- 
biotics (that directly introduce beneficial competing microorganisms) 
or prebiotics (that encourage the growth of beneficial microorganisms). 
The movement of microbes between environments cannot be captured 
by these methods, but the ability of these microorganisms to establish 
themselves and proliferate on arrival can be inferred through an under- 
standing of the ecological network of their destination and their ability to 
incorporate. In early life, the shifting sands of an infant’s microbiota can 
lead to an increase or decrease in the colonization success of particular 
microorganisms™. These dynamics have been tracked using longitudinal 




characterization in work that has demonstrated the correlations between 
the microbiota of mother and child, especially after vaginal delivery, as 
well as the influence of this interaction on the transitional succession 
of microbial ecology in the child’s gut”. The application of longitudinal 
design to MWAS would significantly improve the ability to understand 
the complex linkage between the microbiome and disease, and it would 
also improve knowledge of the link between environmental exposure and 
health outcomes through MWAS-enabled epidemiological investigations. 
Projects such as the Integrative Human Microbiome Project (iHMP)” are
beginning to apply these approaches to larger populations. 


The visualization of complex longitudinal data 

Visualization improves the interpretation of data and could help to guide 
clinical decision-making”. For example, temporal dynamics could be 
observed in the human gut following a faecal microbiota transplant to 
treat a C. difficile infection”, and the successional dynamics of infant 
microbial development could be explored”. Better visualization can 
also help to define the stability, resistance to perturbation and resilience 
to change of microbial communities; however, the quality of the initial 
experimental design is important. Healthy adults have unique microbial 
dynamics, yet patterns of stability and resistance show elements of similar- 
ity, which hints at the potential for universal ecological rules that define 
these relationships between individuals”. Determining the frequency at 
which longitudinal samples should be taken to capture the dynamics that 
are relevant to a specific disease state is an open problem”. For instance, 
two studies with different sampling intervals*” found conflicting results 
with regards to the stability of the microbiota during pregnancy, although 
differences in dietary intervention could have confounded the patterns. 
Capturing the temporal dynamics of specific characteristics, such as the 
level of glucose in the blood or behavioural traits, also presents a constant 
challenge. The frequency at which various types of data show patterns that 
enable the integration and mechanistic prediction of microbial interac- 
tions should be considered. In an era of precision medicine, an under- 
standing of when and how often different sources of information must 
be acquired to enable the appropriate integration of data is paramount. 
At best, inappropriate sampling frequencies fail to produce correlations 
even when mechanistic interactions exist; at worst, they produce mis- 
leading information, which might lead to the identification of incorrect 
biomarkers or therapeutic targets. 
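The point about sampling frequency can be demonstrated with a small constructed example. The signals, the four-step cycle and the helper `pearson` below are hypothetical and chosen only to show how a real, proportional relationship can become undetectable when samples are taken once per cycle.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation, or None when either series has no variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a flat series carries no information about co-variation
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# Hypothetical signals: a metabolite level that cycles every 4 time steps,
# and a taxon abundance that responds proportionally to it.
metabolite = [0, 1, 0, -1] * 6
taxon = [2 * m for m in metabolite]

dense = pearson(metabolite, taxon)             # sampled at every time step
sparse = pearson(metabolite[::4], taxon[::4])  # sampled once per cycle
```

With dense sampling the two series are perfectly correlated. With one sample per cycle both series appear constant, so no correlation can be computed even though the underlying link is unchanged.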


From explanation to prediction 
The microbiome, or even the microbiota, could be used to predict the 
onset of disease before it occurs and to guide individualized therapies. 


Stratification on the basis of the microbiome 

The stratification of people for treatment holds considerable promise. For 
example, variation in the toxicity of acetaminophen (paracetamol) in the 
liver is largely caused by differences in how the drug, which is an analogue 
of the naturally occurring amino acid tyrosine, is metabolized through 
the tyrosine sulfonation pathway”. Similarly, the effectiveness of digoxin 
depends on whether the gut of an individual contains specific strains of 
Eggerthella lenta, the plasmids of which encode an enzyme that rapidly 
degrades digoxin and renders it ineffective’. Similar stories are emerg- 
ing for many other classes of drug, which suggests that incorporating 
the gut microbiome into the stratification of participants in clinical trials 
and the prescription of medication could be of great value. An especially 
interesting example is the emerging relationship between trimethylamine 
N-oxide (TMAO) and cardiovascular disease. People can metabolize cho- 
line, which is found in dietary sources such as red meat and cheese, in a 
variety of ways. One such pathway is catalysed by groups of bacteria that 
are found only in some individuals: choline is metabolized to trimethyl- 
amine, which is then oxidized to TMAO, a compound that contributes 
to the formation of atherosclerotic plaques through mechanisms not yet 
well understood”, although work in mice suggests a possible pathway*. 
Inhibition of the enzyme that produces TMAO or targeting relevant 
bacteria could therefore provide potent weapons against heart disease. 




BOX 2 
Integrating the host 
genome into MWAS 


Although, intuitively, the host genome was thought to be important 
in shaping the microbiome, evidence to support this had been 
lacking. Single genes have been known to exert large effects on the 
gut microbiome in mice; for example, the ob/ob*?!® and Toll-like 
receptor 5 knockout models?!? of obesity have been well studied, 
and the changes in the microbiota that are induced by a single- 
allele mutation can even confer part of the adiposity phenotype 
when transmitted by oral gavage to a genetically normal mouse. 
Consequently, it is well established that a genetic change can trigger 
an aberrant microbial community that is transmissible and can 
transmit the phenotype. Studies of panels of mice have shown that 
diet has a much larger effect on the microbiota than does the host 
genotype’, and this is consistent with the observation that studies 
consisting of only dozens of people are unable to demonstrate that 
monozygotic twins are more similar in composition and function 
of their microbiota than are dizygotic twins”*. However, larger 
studies composed of hundreds of individuals are able to find a 
small association between host genetics and the overall microbial 
community!*"!2. Intriguingly, a few taxa seem to be highly heritable,
notably Christensenella, which is associated with leanness and even 
leads to weight reduction when fed to germ-free mice inoculated 
with the gut microbiota of obese people!””. 


Conversely, it might be possible to predict whether a particular diet has 
adverse consequences for the heart at the level of the individual rather 
than the population. A study™ involving hundreds of people was able to 
demonstrate this potential for diabetes; it used continuous monitoring of 
blood-glucose levels to understand the effects of standardized meals and 
their dietary components. Remarkably, ice cream was less deleterious than 
white rice for some people’s blood glucose, and differences such as these
could largely be predicted by the microbiota (and not by other factors). 
Consequently, using the microbiota to reduce the immense variability 
experienced by those who receive dietary therapies holds much promise. 

Several studies have substantially advanced the field towards the goal 
of using the microbiome or the microbiota to predict disease before it 
occurs. Fascinatingly, different diseases have different dynamics. Gingi- 
vitis, an inflammation of the gums that can be reversed with thorough 
cleaning of the teeth, shows relapse trends that are specific to individuals,
which indicates that a person's unique gingivitis-causing community
of microbes returns in a predictable way*’. By contrast, many individu- 
als carry the same community of dental-caries-causing bacteria, yet the 
emergence of caries can be predicted months in advance of observable 
clinical symptoms by monitoring changes in the microbiota®. Similarly, 
the development of rheumatoid arthritis can be predicted using both oral 
and gut microbial biomarkers”. The potential use of oral biomarkers to 
predict disease that emerges at less accessible sites in the body is exciting. 
The oral microbiota and gut microbiota share many community mem- 
bers, yet the structures of communities are highly distinct, and only weak 
associations have been found between them”. The oral cavity provides 
an ideal site for non-invasive sampling and biomarker testing; the ability 
to use the oral microbiota to predict disease, following MWAS, there- 
fore has tremendous promise. Predictive models are also being applied 
to many other sites in the body and to many other conditions, including 
obesity”, IBD®” and acne”.


An evidence scale for microbiome studies 
Although many studies have reported links between the microbiome and
disease, technical variation between the studies, the effects of which often




exceed those of the underlying biology, makes it difficult to compare and 
interpret MWAS findings”. Efforts to quantify methodological effects 
enable considerable progress to be made towards performing large-scale 
epidemiological studies of the microbiome””’, but the ability to determine 
when specific biases require studies to be analysed separately rather than 
together still relies on intuition. 

Crohn's disease represents one of the best-studied links between the 
microbiome and disease. Multiple studies'*’*”*, including investigations 
of a cohort of Swedish twins (n = 40 pairs of twins)”, have revealed 
the depletion of beneficial members of the microbiota (for example, 
F. prausnitzii, a producer of butyrate) in people with inflammation in 
the small intestine that is associated with Crohn’s disease, known as 
ileal Crohn’s disease, compared with those who have inflammation of 
the colon or with healthy individuals. Increases in Proteobacteria have 
also been seen in these and many other studies*’*’. Analysis of faecal 
samples from the Swedish twin cohort* also revealed the depletion in 
ileal Crohn's disease of proteins required for the metabolism of butyrate, 
whereas metabolite analysis™ revealed an increase in the amounts of some 
bile-acid metabolites and pancreatic enzymes, as well as thousands of 
unidentified metabolites, that could be used to differentiate people with 
Crohn's disease from healthy individuals. 

Type 1 diabetes has been studied in many disparate but small cohorts 
(n < 20 per group per study) of newly diagnosed children”*”’. These studies
identified an elevated relative abundance of Bacteroides and a reduced
relative abundance of Prevotella in those with the disease compared with 
controls. A longitudinal study” of children with a high risk of developing 
diabetes determined that increases in α diversity (the diversity of species at
a particular site in the body) during development were slowed in children 
who went on to develop diabetes (n = 4) but not in seroconverters without 
clinical symptoms (n = 7) or in healthy children (n = 22). A metabolic 
study in a different cohort found that children who developed diabetes 
(n = 50) had lower levels of triglycerides compared with controls (n = 67).
Seroconversion was associated with a transient increase in 2-hydroxybutyrate
and a decrease in ketoleucine. Some of these metabolites might
have microbial origins. 
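
As a rough illustration of what an α-diversity comparison involves, the following Python sketch computes two widely used within-sample summaries, observed richness and the Shannon index, from a vector of taxon counts. The studies discussed here do not necessarily use these particular metrics, and the function names and read counts are hypothetical.

```python
import math


def observed_richness(counts):
    """Number of taxa detected (at least one read) in a single sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index (natural log) of the taxon relative abundances."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts for two samples with the same number of taxa.
even_sample = [25, 25, 25, 25]    # four taxa, evenly represented
skewed_sample = [97, 1, 1, 1]     # four taxa, one dominant

for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, observed_richness(sample), round(shannon_index(sample), 3))
```

Both samples have the same richness, but the Shannon index is lower for the skewed sample because it also accounts for evenness. This is one reason why reported trends in α diversity can depend on the metric chosen.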

Rheumatoid arthritis, a disease not typically thought of as 
being associated with the gut or the mouth, has been linked to the 
microbiomes of both. People with rheumatoid arthritis demonstrate 
consistently increased relative abundances of species of Prevotella in their 
oral and gut microbiotas**””*. Those with newly diagnosed (n = 31) and
chronic (n = 32) rheumatoid arthritis have higher rates of periodontal
disease than do healthy controls (n = 18), even when other risk factors 
such as age and smoking are taken into account”. Amplicon sequenc- 
ing has shown that Prevotella and Leptotrichia OTUs are increased in 
individuals with rheumatoid arthritis, independent of their periodon- 
tal disease status”*. Metagenomic profiling of oral and gut microbiomes 
has identified elevated levels of Prevotella copri in people with rheuma- 
toid arthritis (n = 115) compared with controls (n = 97), as well as an 
enrichment in Gram-positive microorganisms, including members of 
the family Veillonellaceae”’. The presence of Lactobacillus salivarius 
in the oral cavity and faeces correlates positively with antibody titres, 
and this microorganism was more likely to be present in active cases of 
rheumatoid arthritis than in controls. Treatment with disease-modifying 
antirheumatic drugs can partially restore characteristics of the control 
microbiome, including decreased levels of Prevotella, in individuals with 
rheumatoid arthritis”. 

Cardiovascular disease” has been linked to high levels of TMAO, 
a metabolite of phosphatidylcholine, and TMAO is strongly correlated
with both atherosclerotic plaques in a mouse model*® and adverse car- 
diovascular outcomes in people”””’. TMAO has been implicated in other 
conditions that involve the vascular system, including renal disease’ 
and colon cancer’. Treatment with antibiotics attenuates the production 
of TMAO in both mice’ and people” after challenge with phosphatidylcholine.
Alterations have been seen in 16S rRNA amplicon sequencing
profiles of adults from Sweden” and China’” who have experienced 
cardiovascular events, although the same OTUs were not identified in
both cohorts. TMAO might also modulate platelet function and the
risk of developing thrombosis in people’. Subsequent experiments in 
conventional mice have confirmed that TMAO has a role in thrombo- 
sis, whereas germ-free mice seem to be protected from developing this 
phenotype™. In conventional mice, long-term exposure to dietary cho- 
line altered the composition of the microbiome, and several candidate 
taxa, including the families Lachnospiraceae and Mogibacteriaceae, were 
negatively associated with thrombosis. Interestingly, the identification 
of the role of TMAO in cardiovascular disease began in a study of serum
metabolites, and only later moved to studies of the microbiome. 

The links between autism spectrum disorder and the microbiome 
remain controversial; although studies in people have provided statisti- 
cally significant associations, they can be confounded by factors that 
include the diet, gastrointestinal issues and drugs. A 16S rRNA ampli- 
con study showed that people with autism spectrum disorder (n = 20) 
had a lower α diversity than did neurotypical individuals (n = 20)
(ref. 11). Autism spectrum disorder was associated with higher levels of 
Akkermansia and fewer species of fermenter bacteria, including Prevo- 
tella, Coprococcus and Veillonellaceae"’. A study of the offspring of mice 
who had undergone maternal immune activation (MIA) showed that 
alterations occur in the microbiomes and metabolomes of such mice, 
including a reduction in the levels of members of the family Lachno- 
spiraceae, which ferment SCFAs”. The introduction of Bacteroides 
fragilis, a common commensal microbe, led to decreased levels of
4-ethylphenylsulfate and corrected behavioural symptoms. The administration
of 4-ethylphenylsulfate was sufficient to transmit symptoms of
anxiety to wild-type mice”’ and led to permanent immune dysfunction”. 

Despite the emergence of some common themes such as the presence
of specific taxa, overall trends in α diversity and the ability to separate
cases and controls using metrics of β diversity (the differences in
community composition between different samples), it is impossible to 
determine whether a particular condition has a smaller or larger effect 
on the diversity of the microbiota than another, owing to the way that 
individual studies are conducted. A set of standardized protocols would 
enable many different biological and technical effects to be placed on 
a scale that compares common effect sizes. The Microbiome Quality 
Control Project is beginning to do this for technical effects by com- 
paring the specific effects of sample storage, DNA extraction, PCR 
amplification and bioinformatics pipelines, all of which can have sur- 
prisingly large effects; for example, methods and databases used in the 
assignment of taxonomy can have much larger effects on the apparent 
profile of a microbiome than does the choice of biological specimen
examined’®*. Large-scale efforts such as the Earth Microbiome Project’” and
American Gut are beginning to address these issues by studying tens 
of thousands of samples using common methods. The dream would 
be to provide quantitative information that indicates which biological 
effects are larger than specific technical effects (to facilitate a rational 
choice for which studies to compare) and describes the directionality 
of effects, which would enable the use of generalized linear models to 
detrend for specific variables so that subtle effects can be seen against 
the background. For example, American Gut has observed that the age of 
an individual and their self-reported frequency of alcohol consumption 
have approximately equal statistically significant effects on the diversity 
of the gut microbiota: to measure the influence of one variable accurately, 
it is therefore necessary to detrend for the other (American Gut, unpub- 
lished observations). By contrast, body mass index (BMI) has a much 
smaller, although still detectable, effect on the gut microbiota, which
means that controls for age and alcohol use must be applied (or the data 
detrended) to understand the specific effects of BMI. The development 
of a scale for this type of effect size would also be enormously useful for
scoping out new studies: it would enable an educated guess to be made 
about the expected effect size of an intervention or condition from a 
large database of past studies of similar phenomena, and the number of 
participants and longitudinal sampling design (if applicable) could be 
scoped out rationally on the same basis, to the relief of both investigators 
and their institutional review boards. 
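
The detrending idea can be sketched with ordinary least squares, which is the simplest special case of a generalized linear model. The example below assumes a hypothetical, noise-free data set in which diversity depends on both age and BMI; none of the numbers comes from American Gut, and the function names are illustrative only. It removes the linear age trend from both diversity and BMI and then estimates the BMI effect from the residuals.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def detrend(x, y):
    """Residuals of y after removing its fitted linear trend on x."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical toy cohort: diversity = 3.0 + 0.02 * age - 0.01 * bmi (no noise).
age = [20, 30, 40, 50, 60, 70]
bmi = [22, 30, 24, 28, 21, 27]
diversity = [3.0 + 0.02 * a - 0.01 * b for a, b in zip(age, bmi)]

# Naive estimate: regress diversity on BMI and ignore age.
raw_bmi_slope, _ = fit_line(bmi, diversity)

# Detrended estimate: remove the age trend from both variables first.
adjusted_bmi_slope, _ = fit_line(detrend(age, bmi), detrend(age, diversity))

print("BMI slope ignoring age:", raw_bmi_slope)
print("BMI slope after detrending for age:", adjusted_bmi_slope)
```

Because BMI and age are correlated in this toy cohort, the naive slope is biased by the larger age effect, whereas the detrended slope recovers the smaller BMI effect that was built into the data. Real data are noisy, so an analysis of this kind would also need uncertainty estimates.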




Developing a microbial Global Positioning System 

An important challenge for the field is to move beyond abstract maps of 
the microbiome, which enable multivariate samples to be placed in the
context of other samples. It is important to understand which factors, 
including the host genome (Box 2), can change the microbiome from a 
given starting point on a ‘map’, as well as where the ideal endpoint would
be. Such a microbial Global Positioning System (GPS) would comprise a 
defined start point, a defined end point and directions for how to get from 
one to the other and would depend on the standardization of results from 
microbiome studies so that each participant can be located accurately on 
the map and their progress tracked. It also relies on well-defined clinical 
cohorts that enable desirable and undesirable endpoints to be assessed. 
Unstratified patients have the potential to be placed anywhere on the 
map, and their initial location is determined, for example, by principal 
coordinates analysis (PCoA) of UniFrac distances between samples’,
as performed by the Human Microbiome Project and American Gut.
Stratification is then performed to identify certain groups of people in 
different parts of the map, according to specific biomarkers, such as genes, 
functions, metabolites or networks of these features, and perhaps by cross- 
ing different levels of analysis. These biomarkers are then used to relocate 
study participants to appropriate regions of the map, which helps to sug- 
gest specific treatments that would move them from their present location 
to another (Fig. 2). For example, a small change in the diet might provide 
a subtle shift in location on the map and treatment with antibiotics might 
produce a larger shift, whereas faecal transplantation could be considered
‘teleportation’. Readout of biomarkers over time would allow the progress
of each participant to be tracked from unhealthy to healthier regions of 
the map. Overall, many more participants would be expected to reach 
a healthy location on the map than would be possible with unstratified 
treatment, although genetic defects, intractable microbiome states or 
other factors might prevent the recovery of some. This vision requires a 
substantially faster, cheaper and more accurate readout of the microbiome 
across multiple levels than is possible at present, although it will provide 
an exceptionally powerful and clinically relevant model after it has been 
subjected to the appropriate regulatory processes. 
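
To make the idea of locating samples on a ‘map’ more concrete, the following Python sketch computes the first principal coordinate of a small set of samples. It is a simplified stand-in rather than the method used by the projects named above: it substitutes Bray–Curtis dissimilarity for UniFrac (which requires a phylogenetic tree), it extracts only PC1 by power iteration, and the sample counts and function names are hypothetical. An established ordination library would be preferable for real analyses.

```python
import math


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two taxon count vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(u) + sum(v)
    return num / den if den else 0.0


def distance_matrix(samples):
    """All pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(a, b) for b in samples] for a in samples]


def gower_center(dist):
    """Double-centre -0.5 * D^2, the first step of classical PCoA."""
    n = len(dist)
    a = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in a]
    col_mean = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n)]
            for i in range(n)]


def first_principal_coordinate(dist, iterations=500):
    """PC1 coordinates of each sample, found by power iteration."""
    b = gower_center(dist)
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return [0.0] * n
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(vi * bvi for vi, bvi in zip(v, bv))  # Rayleigh quotient
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [x * scale for x in v]


# Hypothetical counts for four taxa in four samples.
samples = [
    [50, 30, 20, 0],
    [45, 35, 20, 0],
    [5, 10, 15, 70],
    [0, 15, 10, 75],
]
pc1 = first_principal_coordinate(distance_matrix(samples))
for index, coordinate in enumerate(pc1):
    print(f"sample {index}: PC1 = {coordinate:.3f}")
```

The sign of a principal coordinate is arbitrary, so only the relative positions of samples along the axis are meaningful. Tracking an individual over time would amount to recomputing, or projecting, their position as new samples arrive.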


Perspective 
The considerable power of using the microbiome, or even the inexpen- 
sively assayed microbiota, to separate cases from controls, as well as to 
predict responses to treatment or the development of diseases in the 
absence of treatment, has already been demonstrated through carefully 
controlled MWAS in research settings. To further develop these tech- 
niques for robust clinical use, MWAS must be validated in larger and 
more diverse populations. Methodologies must also be standardized so 
that differences in the size of technical effects between laboratories do not 
outweigh differences in the size of biological effects, which can make stud- 
ies difficult to combine”*”’. This problem remains a crucial challenge to 
overcome and prevents findings from being developed into clinical tests. 
Longitudinal studies have been especially informative in revealing 
microbiome dynamics that cannot be observed through a before-after 
model. In infants, where profound changes in the microbiota and micro- 
biome occur in the first three years of life, a more detailed understanding 
of the developmental process, and deviations from it, is required to under- 
stand whether changes introduced by diet, environmental exposures, 
antibiotics and other factors in early life keep the microbiome on track or 
divert it towards danger. Similarly, moving away from taxonomic inven- 
tories towards an understanding of the genes, transcripts, proteins and 
metabolites of the microbiome in a multi-omics, systems-biology context 
is crucial for generalizing our understanding of a wide range of diseases in 
which the microbiome is involved, as well as for developing biomarkers 
that could be the basis of useful clinical tests. However, these are conflict- 
ing imperatives: multi-omics studies greatly increase the cost of analysing 
each sample, which means that longitudinal studies on large populations 
quickly become infeasible and tests are too expensive and slow to apply 
on clinically relevant timescales. Consequently, even higher-throughput 
and cheaper methods to process samples for multi-omics studies, as well
as improved modelling techniques that derive systems-level dynamic
parameters from fewer samples, are urgently required. These advances
will rapidly bring us nearer to the dream of a microbial GPS. The Human
Microbiome Project, the Earth Microbiome Project, American Gut and
other large-scale efforts have already, and very effectively, provided a
microbial ‘map’ that enables healthy and diseased samples to be placed
in context, provided that consistent laboratory and bioinformatics methods
are used. In the next few years, data that are collected using consistent
protocols will enable intervention studies from many investigators to be
aggregated to build a general picture of how the microbiome can change in
specific directions in multivariate space. This understanding will facilitate
the provision of ‘turn-by-turn’ directions that enable individuals to use
their microbiome and perhaps even their genotype to understand where
they might want to go on this map and how they can get there most effectively,
in a way that preserves their lifelong health.

[Figure 2 schematic. Flow diagram labels: Unknown pathological status; Stratification according to biomarkers; Individuals assigned to subpools (Type A pathology, Type B pathology, Type C pathology); Precision treatment; Re-stratification; Some individuals achieve healthy status; More individuals achieve healthy status; Healthy. Bottom panel: PCoA plot, axis PC1.]

Figure 2 | Developing a microbial Global Positioning System to stratify
individuals and to guide their treatment. An unstratified pool of individuals
(black), all of whom have the same disease but with different underlying states
(red, blue and grey), are stratified according to a biomarker from the microbiota,
the microbiome or the metabolome (differentiated on a PCoA plot (bottom) or
other analysis). This enables treatments to be chosen for each subpool, which
facilitates movement from an ‘unhealthy’ region to a ‘healthy’ region of the
microbial ‘map’. The position of an individual in the main pool indicates the
same person over time. The microbial Global Positioning System therefore
enables determination of the current location of an individual in terms of their
microbiome configuration, as well as a prediction of their final destination and
directions for how to get there. Ideally, this moves all individuals in the pool to
a healthy status (green) and microbiome, although in real-world situations no
treatment will work perfectly. PC, principal coordinate.


Received 7 December 2015; accepted 6 May 2016. 


1. Manichanh, C. et al. Anal gas evacuation and colonic microbiota in patients with flatulence: effect of diet. Gut 63, 401-408 (2014).

2. Frank, D. N. et al. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc. Natl Acad. Sci. USA 104, 13780-13785 (2007).
This study linked the microbiota to IBD and also demonstrated that various forms of the condition have distinct signatures of microbiota.

3. Lewis, J. D. et al. Inflammation, antibiotics, and diet as environmental stressors of the gut microbiome in pediatric Crohn’s disease. Cell Host Microbe 18, 489-500 (2015).

4. Ley, R. E., Turnbaugh, P. J., Klein, S. & Gordon, J. I. Microbial ecology: human gut microbes associated with obesity. Nature 444, 1022-1023 (2006).

5. Turnbaugh, P. J. et al. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444, 1027-1031 (2006).

6. Koeth, R. A. et al. Intestinal microbiota metabolism of L-carnitine, a nutrient in red meat, promotes atherosclerosis. Nature Med. 19, 576-585 (2013).

7. Kostic, A. D. et al. Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Res. 22, 292-298 (2012).
This study identified high levels of Fusobacterium nucleatum in tissue from human tumours; the bacterium was later confirmed to cause tumours in experiments in animals.

8. Scher, J. U. et al. Expansion of intestinal Prevotella copri correlates with enhanced susceptibility to arthritis. eLife 2, e01202 (2013).
This paper provided the first evidence to directly link the gut microbiota to rheumatoid arthritis in people.

9. Naseribafrouei, A. et al. Correlation between the human fecal microbiota and depression. Neurogastroenterol. Motil. 26, 1155-1162 (2014).

10. Scheperjans, F. et al. Gut microbiota are related to Parkinson’s disease and clinical phenotype. Mov. Disord. 30, 350-358 (2015).

11. Kang, D. W. et al. Reduced incidence of Prevotella and other fermenters in intestinal microflora of autistic children. PLoS ONE 8, e68322 (2013).

12. Kostic, A. D., Howitt, M. R. & Garrett, W. S. Exploring host-microbiota interactions in animal models and humans. Genes Dev. 27, 701-718 (2013).

13. Barr, J. J. et al. Bacteriophage adhering to mucus provide a non-host-derived immunity. Proc. Natl Acad. Sci. USA 110, 10771-10776 (2013).

14. Haiser, H. J. et al. Predicting and manipulating cardiac drug inactivation by the human gut bacterium Eggerthella lenta. Science 341, 295-298 (2013).
This study provided a mechanism to underpin the high variation between individuals in efficacy of the cardiac drug digoxin, which was suspected (but not yet proven) to be linked to its metabolism by Eggerthella lenta.

15. Arthur, J. C. et al. Intestinal inflammation targets cancer-inducing activity of the microbiota. Science 338, 120-123 (2012).

16. Knights, D. et al. Bayesian community-wide culture-independent microbial source tracking. Nature Methods 8, 761-763 (2011).

17. Arumugam, M. et al. Enterotypes of the human gut microbiome. Nature 473, 174-180 (2011).

18. Gevers, D. et al. The treatment-naive microbiome in new-onset Crohn’s disease. Cell Host Microbe 15, 382-392 (2014).
This study of treatment-naive children who had been freshly diagnosed with Crohn’s disease enabled the effects of treatment to be separated from those of the condition.

19. Kuczynski, J. et al. Microbial community resemblance methods differ in their ability to detect biologically relevant patterns. Nature Methods 7, 813-819 (2010).

20. Friedman, J. & Alm, E. J. Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 8, e1002687 (2012).

21. Lovell, D., Pawlowsky-Glahn, V., Egozcue, J. J., Marguerat, S. & Bähler, J. Proportionality: a valid alternative to correlation for relative data. PLoS Comput. Biol. 11, e1004075 (2015).

22. Knights, D., Parfrey, L. W., Zaneveld, J., Lozupone, C. & Knight, R. Human-associated microbial signatures: examining their predictive value. Cell Host Microbe 10, 292-296 (2011).

23. Turnbaugh, P. J. et al. A core gut microbiome in obese and lean twins. Nature 457, 480-484 (2009).

24. The Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature 486, 207-214 (2012).

25. Fierer, N. et al. Forensic identification using skin bacterial communities. Proc. Natl Acad. Sci. USA 107, 6477-6481 (2010).

26. Franzosa, E. A. et al. Identifying personal microbiomes using metagenomic codes. Proc. Natl Acad. Sci. USA 112, E2930-E2938 (2015).

27. Eren, A. M. et al. Oligotyping: differentiating between closely related microbial taxa using 16S rRNA gene data. Methods Ecol. Evol. 4, 1111-1119 (2013).
This paper demonstrated how careful analysis of exact 16S rRNA sequences that avoids clustering into OTUs can reveal fine-grained information that can be useful for forensic matching.

28. Noecker, C. et al. Metabolic model-based integration of microbiome taxonomic and metabolomic profiles elucidates mechanistic links between ecological and metabolic variation. mSystems 13, http://dx.doi.org/10.1128/mSystems.00013-15 (2015).

29. Koenigsknecht, M. J. et al. Dynamics and establishment of Clostridium difficile infection in the murine gastrointestinal tract. Infect. Immun. 83, 934-941 (2015).

30. Hsiao, E. Y. et al. Microbiota modulate behavioral and physiological abnormalities associated with neurodevelopmental disorders. Cell 155, 1451-1463 (2013).
This study showed that the phenotype of a mouse model of autism spectrum disorder could be traced, in part, to a single molecule (4-ethylphenylsulfate) and a shift in the microbiota that can be partially restored using a probiotic.

31. Belda-Ferre, P. et al. The human oral metaproteome reveals potential biomarkers for caries disease. Proteomics 15, 3497-3507 (2015).

32. Willing, B. P. et al. A pyrosequencing study in twins shows that gastrointestinal microbial profiles vary with inflammatory bowel disease phenotypes. Gastroenterology 139, 1844-1854 (2010).

33. Erickson, A. R. et al. Integrated metagenomics/metaproteomics reveals human host-microbiota signatures of Crohn’s disease. PLoS ONE 7, e49138 (2012).

34. Jansson, J. et al. Metabolomics reveals metabolic biomarkers of Crohn’s disease. PLoS ONE 4, e6386 (2009).

35. Michail, S. et al. Altered gut microbial energy and metabolism in children with non-alcoholic fatty liver disease. FEMS Microbiol. Ecol. 91, 1-9 (2015).

36. Forslund, K. et al. Disentangling type 2 diabetes and metformin treatment signatures in the human gut microbiota. Nature 528, 262-266 (2015).
This paper compared two discordant studies of microbiomes in type 2 diabetes and showed that the alleged effect of diabetes could be attributed mostly to differences in use of metformin, which has an unexpectedly large effect on the microbiome, between the two populations.

37. Ridaura, V. K. et al. Gut microbiota from twins discordant for obesity modulate metabolism in mice. Science 341, 1241214 (2013).
This study demonstrated that phenotypes such as increased adiposity could be transferred from people to mice using personalized culture collections.

38. Li, H. & Jia, W. Cometabolism of microbes and host: implications for drug metabolism and drug-induced toxicity. Clin. Pharmacol. Ther. 94, 574-581 (2013).

39. Nicholson, J. K., Lindon, J. C. & Holmes, E. ‘Metabonomics’: understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica 29, 1181-1189 (1999).

40. Wishart, D. S. Emerging applications of metabolomics in drug discovery and precision medicine. Nature Rev. Drug Discov. http://dx.doi.org/10.1038/nrd.2016.32 (2016).

41. da Silva, R. R., Dorrestein, P. C. & Quinn, R. A. Illuminating the dark matter in metabolomics. Proc. Natl Acad. Sci. USA 112, 12549-12550 (2015).

42. Gilbert, J. A. & Henry, C. Predicting ecosystem emergent properties at multiple scales. Environ. Microbiol. Rep. 7, 20-22 (2015).

43. Allen, L. et al. Pyocyanin production by Pseudomonas aeruginosa induces neutrophil apoptosis and impairs neutrophil-mediated host defenses in vivo. J. Immunol. 174, 3643-3649 (2005).

44. Puertollano, E., Kolida, S. & Yaqoob, P. Biological significance of short-chain fatty acid metabolism by the intestinal microbiome. Curr. Opin. Clin. Nutr. Metab. Care 17, 139-144 (2014).

45. Cimermancic, P. et al. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 158, 412-421 (2014).

46. Donia, M. S. et al. A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics. Cell 158, 1402-1414 (2014).
This paper showed that the human microbiome harbours many biosynthetic gene clusters, including those required for the production of antibiotics.

47. Cummings, J. H. Fermentation in the human large intestine: evidence and implications for health. Lancet 1, 1206-1209 (1983).

48. Huda-Faujan, N. et al. The impact of the level of the intestinal short chain fatty acids in inflammatory bowel disease patients versus healthy subjects. Open Biochem. J. 4, 53-58 (2010).

49. Rios-Covian, D. et al. Intestinal short chain fatty acids and their link with diet and human health. Front. Microbiol. 7, 185 (2016).

50. Wikoff, W. R. et al. Metabolomics analysis reveals large effects of gut microflora on mammalian blood metabolites. Proc. Natl Acad. Sci. USA 106, 3698-3703 (2009).

51. Donia, M. S. & Fischbach, M. A. Small molecules from the human microbiota. Science 349, 1254766 (2015).

52. Nougayrède, J. P. et al. Escherichia coli induces DNA double-strand breaks in eukaryotic cells. Science 313, 848-851 (2006).

53. Putze, J. et al. Genetic structure and distribution of the colibactin genomic island among members of the family Enterobacteriaceae. Infect. Immun. 77, 4696-4703 (2009).

54. Buffie, C. G. et al. Precision microbiome reconstitution restores bile acid mediated resistance to Clostridium difficile. Nature 517, 205-208 (2015).

55. Langille, M. G. et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnol. 31, 814-821 (2013).

56. Allegretti, J. R. et al. Recurrent Clostridium difficile infection associates with distinct bile acid and microbiome profiles. Aliment. Pharmacol. Ther. 43, 1142-1153 (2016).

57. Maurice, C. F., Haiser, H. J. & Turnbaugh, P. J. Xenobiotics shape the physiology and gene expression of the active human gut microbiome. Cell 152, 39-50 (2013).

58. Mani, S., Boelsterli, U. A. & Redinbo, M. R. Understanding and modulating mammalian-microbial communication for improved human health. Annu. Rev. Pharmacol. Toxicol. 54, 559-580 (2014).

59. Wallace, B. D. et al. Alleviating cancer drug toxicity by inhibiting a bacterial enzyme. Science 330, 831-835 (2010).
This paper demonstrated that the cancer therapeutic drug irinotecan causes severe diarrhoea because of its reactivation and metabolism by bacterial β-glucuronidases; inhibiting these enzymes with a drug that targets the bacteria, rather than the host, reduces toxicity.

60. ElRakaiby, M. et al. Pharmacomicrobiomics: the impact of human microbiome variations on systems pharmacology and personalized therapeutics. OMICS 18, 402-414 (2014).

61. Clayton, T. A., Baker, D., Lindon, J. C., Everett, J. R. & Nicholson, J. K. Pharmacometabonomic identification of a significant host-microbiome metabolic interaction affecting human drug metabolism. Proc. Natl Acad. Sci. USA 106, 14728-14733 (2009).
This study provided the first link between the toxicity of a drug (in this case, acetaminophen, a widely used analgesic) and microbial metabolism.

62. Wilson, I. D. Drugs, bugs, and personalized medicine: pharmacometabonomics enters the ring. Proc. Natl Acad. Sci. USA 106, 14187-14188 (2009).

63. Quinn, R. A. et al. From sample to multi-omics conclusions in under 48 hours. mSystems http://dx.doi.org/10.1128/mSystems.00038-16 (2016).

64. Lax, S. et al. Longitudinal analysis of microbial interaction between humans and the indoor environment. Science 345, 1048-1052 (2014).

65. Gibbons, S. M. et al. Ecological succession and viability of human-associated microbiota on restroom surfaces. Appl. Environ. Microbiol. 81, 765-773 (2015).

66. Korem, T. et al. Growth dynamics of gut microbiota in health and disease inferred from single metagenomic samples. Science 349, 1101-1106 (2015).

67. Quick, J. et al. Rapid draft sequencing and real-time nanopore sequencing in a hospital outbreak of Salmonella. Genome Biol. 16, 114 (2015).

68. Dominguez-Bello, M. G. et al. Partial restoration of the microbiota of cesarean-born infants via vaginal microbial transfer. Nature Med. 22, 250-253 (2016).

69. Goyal, M. S., Venkatesh, S., Milbrandt, J., Gordon, J. I. & Raichle, M. E. Feeding the brain and nurturing the mind: linking nutrition and the gut microbiota to brain development. Proc. Natl Acad. Sci. USA 112, 14105-14112 (2015).

70. Biteen, J. S. et al. Tools for the microbiome: nano and beyond. ACS Nano 10, 6-37 (2016).

71. Lima-Mendez, G. et al. Determinants of community structure in the global plankton interactome. Science 348, 1262073 (2015).

72. Sam Ma, Z. et al. Network analysis suggests a potentially ‘evil’ alliance of opportunistic pathogens inhibited by a cooperative network in human milk bacterial communities. Sci. Rep. 5, 8275 (2015).

73. Bäckhed, F. et al. Dynamics and stabilization of the human gut microbiome during the first year of life. Cell Host Microbe 17, 690-703 (2015); erratum 17, 852 (2015).

74. The Integrative HMP (iHMP) Research Network Consortium. The Integrative Human Microbiome Project: dynamic analysis of microbiome-host omics profiles during periods of human health and disease. Cell Host Microbe 16, 276-289 (2014).

75. Vazquez-Baeza, Y., Pirrung, M., Gonzalez, A. & Knight, R. EMPeror: a tool for visualizing high-throughput microbial community data. Gigascience 2, 16 (2013).

76. Weingarden, A. et al. Dynamic changes in short- and long-term bacterial composition following fecal microbiota transplantation for recurrent Clostridium difficile infection. Microbiome 3, 10 (2015).
This paper introduced animation techniques that revealed the transformation of the whole microbiota during faecal microbiota transplantation for C. difficile infection.

77. Koenig, J. E. et al. Succession of microbial consortia in the developing infant gut microbiome. Proc. Natl Acad. Sci. USA 108 (suppl. 1), 4578-4585 (2011).

78. Lozupone, C. A. et al. Meta-analyses of studies of the human microbiota. Genome Res. 23, 1704-1714 (2013).

79. Flores, G. E. et al. Temporal variability is a personalized feature of the human microbiome. Genome Biol. 15, 531 (2014).

80. Shade, A. et al. Conditionally rare taxa disproportionately contribute to temporal changes in microbial diversity. mBio 5, e01371-14 (2014).

81. DiGiulio, D. B. et al. Temporal and spatial variation of the human microbiota during pregnancy. Proc. Natl Acad. Sci. USA 112, 11060-11065 (2015).

82. Koren, O. et al. Host remodeling of the gut microbiome and metabolic changes during pregnancy. Cell 150, 470-480 (2012).

83. Wang, Z. et al. Prognostic value of choline and betaine depends on intestinal microbiota-generated metabolite trimethylamine-N-oxide. Eur. Heart J. 35, 904-910 (2014).

84. Zeevi, D. et al. Personalized nutrition by prediction of glycemic responses. Cell 163, 1079-1094 (2015).
This study showed that individual glycaemic responses could be predicted using the microbiome; it also revealed that although population averages match conventional glycaemic-index values, the responses of individuals are highly idiosyncratic and dependent on the microbiome.

85. Teng, F. et al. Prediction of early childhood caries via spatial-temporal variations of oral microbiota. Cell Host Microbe 18, 296-306 (2015).

86. Huang, S. et al. Predictive modeling of gingivitis severity and susceptibility via oral microbiota. ISME J. 8, 1768-1780 (2014).

87. Zhang, X. et al. The oral and gut microbiomes are perturbed in rheumatoid arthritis and partly normalized after treatment. Nature Med. 21, 895-905 (2015).

88. Ding, T. & Schloss, P. D. Dynamics and associations of microbial community types across the human body. Nature 509, 357-360 (2014).

89. Cotillard, A. et al. Dietary intervention impact on gut microbial gene richness. Nature 500, 585-588 (2013).


REVIEW 


90. Walters, W.A., Xu, Z. & Knight, R. Meta-analyses of human gut microbes 
associated with obesity and IBD. FEBS Lett. 588, 4223-4233 (2014). 

91. Kang, D., Shi, B., Erfe, M. C., Craft, N. & Li, H. Vitamin B,, modulates the 
transcriptome of the skin microbiota in acne pathogenesis. Sci. Transl. Med. 7, 
293ra103 (2015). 

92. Sinha, R., Abnet, C. C., White, O., Knight, R. & Huttenhower, C. The microbiome 
quality control project: baseline study design and future directions. Genome Biol. 
16, 276 (2015). 

93. Sinha, R. et a/. Collecting fecal samples for microbiome analyses in epidemiology 
studies. Cancer Epidemiol. Biomarkers Prev. 25, 407-416 (2016). 

94. Sokol, H. et al. Faecalibacterium prausnitzii is an anti-inflammatory commensal 
bacterium identified by gut microbiota analysis of Crohn disease patients. Proc. 
Nat! Acad. Sci. USA 105, 16731-16736 (2008). 

95. de Goffau, M. C. et al. Fecal microbiota composition differs between children with 
B-cell autoimmunity and those without. Diabetes 62, 1238-1244 (2013). 

96. Giongo, A. et al. Toward defining the autoimmune microbiome for type 1 
diabetes. ISME J. 5, 82-91 (2011). 

97. Kostic, A. D. et al. The dynamics of the human infant gut microbiome in 
development and in progression toward type 1 diabetes. Cell Host Microbe 17, 
260-273 (2015). 

98. Scher, J. U. et al. Periodontal disease and the oral microbiota in new-onset 
rheumatoid arthritis. Arthritis Rheum. 64, 3083-3094 (2012). 

99. Koren, O. et al. Human oral, gut, and plaque microbiota in patients with 
atherosclerosis. Proc. Natl Acad. Sci. USA 108 (suppl. 1), 4592-4598 (2011). 

100. Yin, J. et al. Dysbiosis of gut microbiota with reduced trimethylamine-N-oxide 
level in patients with large-artery atherosclerotic stroke or transient ischemic 
attack. J. Am. Heart Assoc. 4, €002699 (2015). 

101.Tang, W. H. et a/. Gut microbiota-dependent trimethylamine N-oxide (TMAO) 
pathway contributes to both development of renal insufficiency and mortality 
risk in chronic kidney disease. Circ. Res. 116, 448-455 (2015). 

102.Xu, R., Wang, Q. & Li, L.A genome-wide systems analysis reveals strong link 
between colorectal cancer and trimethylamine N-oxide (TMAO), a gut microbial 
metabolite of dietary meat and fat. BMC Genomics 16 (suppl. 7), S4 (2015). 

103. Tang, W. H. et a/. Intestinal microbial metabolism of phosphatidylcholine and 
cardiovascular risk. N. Engl. J. Med. 368, 1575-1584 (2013). 

104. Zhu, W. et al. Gut microbial metabolite TMAO enhances platelet hyperreactivity 
and thrombosis risk. Cel/ 165, 111-124 (2016). 

105.Hsiao, E. Y., McBride, S. W., Chow, J., Mazmanian, S. K. & Patterson, P. 

H. Modeling an autism risk factor in mice leads to permanent immune 
dysregulation. Proc. Natl Acad. Sci. USA 109, 12776-12781 (2012). 

106.Liu, Z., DeSantis, T. Z., Andersen, G. L. & Knight, R. Accurate taxonomy 
assignments from 16S rRNA sequences produced by highly parallel 
pyrosequencers. Nucleic Acids Res. 36, e120 (2008). 

107.Gilbert, J. A. et al. Meeting report: the terabase metagenomics workshop and the 
vision of an Earth microbiome project. Stand. Genomic Sci. 3, 243-248 (2010). 

108.Lozupone, C. & Knight, R. UniFrac: a new phylogenetic method for comparing 
microbial communities. Appl. Environ. Microbiol. 71, 8228-8235 (2005). 

109. Quinn, R. A. et al. Microbial, host and xenobiotic diversity in the cystic fibrosis 
sputum metabolome. /SME J. 10, 1483-98 (2015). 

110.Ridlon, J. M., Kang, D. J., Hylemon, P. B. & Bajaj, J. S. Bile acids and the gut 

microbiome. Curr. Opin. Gastroenterol. 30, 332-338 (2014). 

111.Gill, S. R. et al. Metagenomic analysis of the human distal gut microbiome. 

Science 312, 1355-1359 (2006). 

This study provided the first metagenomic gene catalogue of the human gut. 
2.Qin, J. et al. A human gut microbial gene catalogue established by metagenomic 

sequencing. Nature 464, 59-65 (2010). 

113.Turnbaugh, P. J. et al. The human microbiome project. Nature 449, 8304-810 (2007). 

4.White, J. R., Nagarajan, N. & Pop, M. Statistical methods for detecting 

differentially abundant features in clinical metagenomic samples. PLoS Comput. 
Biol. 5, 1000352 (2009). 

.Love, M. |., Huber, W. & Anders, S. Moderated estimation of fold change and 

dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014). 

andal, S. et al. Analysis of composition of microbiomes: a novel method for 

studying microbial composition. Microb. Ecol. Health Dis. 26, 27663 (2015). 

.Knights, D., Costello, E. K. & Knight, R. Supervised classification of human 

microbiota. FEMS Microbiol. Rev. 35, 343-359 (2011). 

118.Ley, R. E. et a/. Obesity alters gut microbial ecology. Proc. Nat! Acad. Sci. USA 102, 

11070-11075 (2005). 

119.Vijay-Kumar, M. et al. Metabolic syndrome and altered gut microbiota in mice 

lacking Toll-like receptor 5. Science 328, 228-231 (2010). 

120.Parks, B. W. et a/. Genetic control of obesity and gut microbiota composition in 
response to high-fat, high-sucrose diet in mice. Cel! Metab. 17, 141-152 (2013). 

121.Yatsunenko, T. et al. Human gut microbiome viewed across age and geography. 
Nature 486, 222-227 (2012). 

122.Goodrich, J. K. et al. Human genetics shape the gut microbiome. Cel/ 159, 
789-799 (2014). 




Acknowledgements This work and the work in the authors’ laboratories that it 
describes was supported in part by awards from the US National Institutes of Health, 
the US Department of Energy, the US National Science Foundation, the Alfred P. 
Sloan Foundation, the Crohn’s and Colitis Foundation of America and the US Office of 
Naval Research. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. Readers 
are welcome to comment on the online version of this paper at go.nature.com/28jf0c5. 
Correspondence should be addressed to R.K. (robknight@ucsd.edu). 


7 JULY 2016 | VOL 535 | NATURE | 103 


© 2016 Macmillan Publishers Limited. All rights reserved. 


ARTICLE 


doi:10.1038/nature18609 


Species-specific wiring for direction 
selectivity in the mammalian retina 


Huayu Ding¹, Robert G. Smith², Alon Poleg-Polsky¹, Jeffrey S. Diamond¹ & Kevin L. Briggman³,⁴


Directionally tuned signalling in starburst amacrine cell (SAC) dendrites lies at the heart of the circuit that detects the 
direction of moving stimuli in the mammalian retina. The relative contributions of intrinsic cellular properties and 
network connectivity to SAC direction selectivity remain unclear. Here we present a detailed connectomic reconstruction 
of SAC circuitry in mouse retina and describe two previously unknown features of synapse distributions along SAC 
dendrites: input and output synapses are segregated, with inputs restricted to proximal dendrites; and the distribution 
of inhibitory inputs is fundamentally different from that observed in rabbit retina. An anatomically constrained SAC 
network model suggests that SAC-SAC wiring differences between mouse and rabbit retina underlie distinct contributions 
of synaptic inhibition to velocity and contrast tuning and receptive field structure. In particular, the model indicates 
that mouse connectivity enables SACs to encode lower linear velocities that account for smaller eye diameter, thereby 
conserving angular velocity tuning. These predictions are confirmed with calcium imaging of mouse SAC dendrites
responding to directional stimuli.


A thorough understanding of a neuronal circuit requires a detailed 
anatomical wiring diagram that includes the synaptic connectivity 
among the component neurons. Even ostensibly subtle connectivity 
differences during development or between species could underlie 
substantial changes in circuit behaviour. This is exemplified in the 
direction selectivity circuit in the mammalian retina, a model neural 
network that engages just a few well-characterized cell types to compute 
salient visual information. However, the detailed synaptic connectivity
among these neurons, and circuitry differences between species, have
not been completely described.

Direction-selective ganglion cells (DSGCs) respond strongly to
visual motion in one (preferred) direction but only weakly to motion
in the opposite (null) direction¹. Bipolar cells provide excitatory synaptic
inputs to DSGCs and to densely arrayed SACs, which then inhibit
DSGCs (Fig. 1a). SAC dendrites oriented asymmetrically to a DSGC
provide feedforward inhibitory input that establishes DSGC directional
tuning. SAC dendrites are themselves directionally selective
and release the neurotransmitter GABA (γ-aminobutyric acid) from
synaptic terminals at their tips preferentially in response to outward
(centrifugal) compared to inward (centripetal) motion relative to
their soma. Several mechanisms contribute to direction selectivity
within individual SAC dendrites, but the relative importance of the
mechanisms is unclear. Proposed intrinsic mechanisms include dendritic
morphology, non-uniform chloride homeostasis, and active
membrane conductances. SAC direction selectivity may also rely
on network interactions, such as spatially offset synaptic inputs from
particular bipolar cell types and reciprocal inhibition between neighbouring
SACs.

Most anatomical analyses of SAC microcircuitry have been performed
in rabbit retina. Sparse electron microscopy reconstructions
in rabbit indicated that excitatory and inhibitory synaptic inputs occur
along the entire length of SAC dendrites, whereas inhibitory synaptic
outputs arise on the distal third. We explored SAC connectivity in
mouse retina using serial block-face scanning electron microscopy.
We discovered a previously unknown asymmetric distribution of
inhibitory and excitatory input synapses onto ON and OFF mouse SAC
dendrites that is fundamentally different from the connectivity in rabbit
retina. We developed an anatomically constrained network model of
mouse SAC connectivity that predicts new roles for synaptic inhibition
in velocity and contrast tuning and receptive field structure in SACs.
Finally, we confirmed these predictions by recording directionally
tuned responses in mouse SAC dendrites. Our results indicate that the
SAC network has adapted to meet the specific demands imposed by
the mouse visual system.


Synaptic inputs are spatially offset 

We annotated an ON-OFF DSGC within a conventionally stained serial
block-face scanning electron microscopy volume (50 × 210 × 260 µm³)
from an adult mouse retina (Extended Data Fig. 1a, b). Neurites forming
conventional (inhibitory) synapses (Fig. 1b) onto this cell were
back-traced to identify four SACs (2 ON, 2 OFF) located centrally in
the data volume (Fig. 1c–e and Extended Data Fig. 1c–e). The morphology
of each SAC was fully traced within the data volume and the
locations of input and output synapses were annotated. As expected,
output synapses arose along the distal third of SAC dendritic trees
(Fig. 1f, g). Ribbon-type input synapses (Fig. 1b) from bipolar cells
were distributed primarily along the proximal two-thirds of dendrites
(Fig. 1f, g). Conventional synapses from amacrine cells (Fig. 1b) were
restricted to the initial third of the dendritic trees (Fig. 1f, g). This
proximal location of amacrine cell inputs differs from previous reports
in rabbit retina that SACs receive reciprocal SAC inputs along their
distal dendrites (Extended Data Fig. 1f), indicating that SAC
connectivity is fundamentally different in mice and rabbits. Next, we
identified cells that were presynaptic to the SACs.
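
The summary of synapse positions above can be illustrated with a short sketch that bins annotated synapses by radial distance from the soma into proximal, middle and distal thirds. It is only a sketch under stated assumptions: distance is taken as planar Euclidean distance, and the function names, the 150 µm arbor radius and the example coordinates are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def radial_distance_um(soma_xy, synapse_xy):
    """Planar Euclidean distance (µm) from soma to synapse (assumed metric)."""
    return math.hypot(synapse_xy[0] - soma_xy[0], synapse_xy[1] - soma_xy[1])


def dendritic_third(distance_um, arbor_radius_um):
    """Label a radial distance as the proximal, middle or distal third."""
    fraction = distance_um / arbor_radius_um
    if fraction < 1 / 3:
        return "proximal"
    if fraction < 2 / 3:
        return "middle"
    return "distal"


def tally_by_third(soma_xy, synapses, arbor_radius_um):
    """Count synapses per (kind, third); `synapses` is a list of (kind, (x, y))."""
    counts = Counter()
    for kind, xy in synapses:
        distance = radial_distance_um(soma_xy, xy)
        counts[(kind, dendritic_third(distance, arbor_radius_um))] += 1
    return counts


# Hypothetical coordinates (µm) for illustration only; not data from the study.
example_synapses = [
    ("AC input", (10.0, 5.0)),
    ("BC input", (36.0, 48.0)),
    ("BC input", (20.0, 0.0)),
    ("SAC output", (0.0, 120.0)),
]
counts = tally_by_third((0.0, 0.0), example_synapses, arbor_radius_um=150.0)
for key in sorted(counts):
    print(key, counts[key])
```

With real annotations, the same tally would show whether amacrine cell inputs fall in the proximal third and outputs in the distal third, as reported in Fig. 1f, g.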


Bipolar and amacrine cell types presynaptic to SACs 

Recent analysis of contact area shared between different OFF bipolar
cell types and OFF SACs suggested a ‘space-time wiring’ presynaptic
delay model that supports SAC direction selectivity. In this model,
different bipolar cell types exhibit distinct release kinetics, and


¹Synaptic Physiology Section, National Institute of Neurological Disorders and Stroke, Bethesda, Maryland 20892, USA. ²Department of Neuroscience, University of Pennsylvania, Philadelphia,
Pennsylvania 19104, USA. ³Department of Biomedical Optics, Max Planck Institute for Medical Research, Heidelberg 69120, Germany. ⁴Circuit Dynamics and Connectivity Unit, National Institute
of Neurological Disorders and Stroke, Bethesda, Maryland 20892, USA.




Figure 1 | Synaptic connectivity of mouse
SACs. a, Schematic diagram of the direction
selectivity circuitry. AC, amacrine cells; BC,
bipolar cells; ONL, outer nuclear layer; OPL,
outer plexiform layer; INL, inner nuclear layer;
IPL, inner plexiform layer; GCL, ganglion cell
layer. b, Representative examples of a presynaptic
SAC (black arrow, left image) contacting a
postsynaptic SAC (white arrow, left image) and
a presynaptic bipolar cell (black arrow, right
image) forming a ribbon synapse with two
postsynaptic SACs (white arrows, right image).
c, e, Distribution of excitatory (blue) and inhibitory
(red) input synapses and output synapses (black)
onto ON and OFF SACs. d, Horizontal view of
ON and OFF SACs, whose somata reside in the
GCL or INL, respectively. f, g, Histograms of
radial distances from the soma for annotated
synapses. Data pooled from n = 2 ON and n = 2
OFF SACs. Scale bars, 1 µm (b), 50 µm (c, e),
25 µm (d).

[Panels f, g: ON SACs (f) and OFF SACs (g); number of synapses versus radial distance from soma (µm); series, AC inputs, BC inputs and SAC outputs.]


sustained bipolar cells (for example, BC2) provide input more proximally
on SAC dendrites than do transient bipolar cells (for example,
BC3a). Because our data set allowed us to positively identify
synapses, we classified bipolar cell types providing input to ON and
OFF SACs (Fig. 2), and we noted several differences compared to the
contact-based analysis.

We found synapses onto the OFF SACs from all OFF bipolar cell 
types (BC1, BC2, BC3a, BC3b, and BC4; Fig. 2a, b and Extended Data 
Fig. 2) with most input from BC1, BC2, and BC3a (Fig. 2a, b). BC1 
and BC3a exhibited segregated radial distributions, potentially supporting
a presynaptic space-time wiring model, but BC2 overlapped
with both; this overlap, regardless of BC2 response kinetics, would 
presumably diminish direction selectivity generated in such a model. 
Space-time wiring may still support direction selectivity in OFF SACs, 
pending characterization of type-specific OFF bipolar cell response 
kinetics. Our data suggest that SAC dendrites may simply sample from 
the available bipolar cells at a particular depth in the inner plexiform 
layer (IPL) (Fig. 2c), regardless of bipolar cell release characteristics. 

If space-time wiring were essential for SAC direction selectivity, one
would expect a similar connectivity pattern for ON SACs. Bipolar cell
inputs to the ON SACs clustered into four subtypes, corresponding
to BC7 and three BC5 subtypes (BC5o, BC5t, BC5i) (Fig. 2d, e and
Extended Data Fig. 3). We found that type BC7 primarily contacted
proximal dendrites, whereas BC5 inputs, collectively, were distributed
more distally (Fig. 2d, e). The radial location of synapses correlated with
their IPL depth (Fig. 2f). Segregated bipolar cell inputs to ON SACs
could support a space-time direction selectivity mechanism, although
type BC7 (which we show provides proximal inputs) exhibits transient
light responses, counter to the model’s requirements.




Next, we analysed the sources of amacrine cell synapses onto the ON
and OFF SACs (Fig. 3a, b and Extended Data Fig. 4). Most inputs originated
from neighbouring SACs, identified by their distinctive branching
pattern and tight co-stratification with the postsynaptic SACs
(Fig. 3a and Extended Data Fig. 4a, b). There was no directional preference
in the absolute orientations of presynaptic SAC dendrites. Previous
studies hypothesized that SAC direction selectivity could be enhanced
if opposing (‘anti-parallel’) SAC dendrites preferentially made reciprocal
connections. To test this idea, we measured the relative angle
between connected presynaptic and postsynaptic dendrites (Fig. 3c and 
Extended Data Fig. 5a, b). The distributions of relative angles for both 
the ON and OFF SACs were significantly skewed towards anti-parallel 
(180°) wiring (Kolmogorov-Smirnov test, P=2 x 10~*°; Fig. 3c). We 
considered whether presynaptic SAC dendrites selectively connect to 
opposing dendrites or whether the relative angle distribution simply 
reflects the inter-soma spacing between SACs. We annotated locations 
where the distal third of presynaptic SAC dendrites passed within 1 µm
of the postsynaptic SACs and measured the relative angles between 
dendrites at each proximity. The proximity-based relative angle 
distribution was not statistically significantly different from the distribution
based on actual synaptic connectivity (Extended Data Fig. 5c,
Kolmogorov-Smirnov test, P=0.18), indicating that the wiring arises 
primarily from the geometric arrangement of connected SACs. Relative 
angle was not correlated to the radial distance of each synapse from the 
respective postsynaptic SAC soma (Extended Data Fig. 5d). 
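
The relative-angle comparison described above can be sketched as follows. This is a minimal illustration, assuming each dendrite's orientation is represented by the planar vector from its soma to the synaptic contact; the function names are hypothetical. The two-sample Kolmogorov-Smirnov statistic is computed in its standard form (the largest gap between two empirical cumulative distributions). No P value is computed here, because the exact testing procedure is not described in this section.

```python
import math


def relative_angle_deg(pre_vec, post_vec):
    """Angle (0 to 180 degrees) between two planar dendrite direction vectors."""
    dot = pre_vec[0] * post_vec[0] + pre_vec[1] * post_vec[1]
    norm = math.hypot(*pre_vec) * math.hypot(*post_vec)
    # Clamp to guard against rounding just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    d = 0.0
    for x in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(1 for v in sample_a if v <= x) / len(sample_a)
        cdf_b = sum(1 for v in sample_b if v <= x) / len(sample_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Anti-parallel dendrites give 180 degrees; perpendicular dendrites give 90.
print(relative_angle_deg((1.0, 0.0), (-1.0, 0.0)))
print(relative_angle_deg((1.0, 0.0), (0.0, 1.0)))
```

Applying `ks_statistic` to the synapse-based and proximity-based angle samples would quantify how far apart the two distributions are; a small value is what the proximity control reported above would lead one to expect.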

Not all inhibitory inputs came from neighbouring SACs. We annotated
several apparent wide-field amacrine cells that contributed synapses
specifically onto the most proximal dendrites of ON and OFF
SACs (Extended Data Fig. 4c, d). Wide-field amacrine cells did not




co-stratify with SACs, but rather stratified close to the inner nuclear
and ganglion cell layers, in contrast to a different population targeting
bipolar cell axon terminals presynaptic to DSGCs. We also found a
few synapses from narrow-field amacrine cells, mostly onto ON SACs
(Extended Data Fig. 4e). Therefore, although most proximal amacrine
inputs originate from neighbouring SACs, additional inputs may
selectively inhibit perisomatic compartments.

We also quantified the number and types of postsynaptic targets of 
ON and OFF SAC branches terminating near the centre of the data 
volume (Extended Data Fig. 6). We traced postsynaptic cells until they 
could be identified unambiguously as a ganglion cell, SAC, wide-field 
amacrine cell or bipolar cell. Synapses were formed primarily onto ganglion
cells and SACs, with few outputs onto bipolar cells, consistent
with findings that bipolar cell terminals are not directionally tuned.
ON SACs devoted a higher fraction of outputs to ganglion cells than did 
OFF SACs, possibly because ON SACs provide inputs to both ON-OFF 
DSGCs and ON DSGCs. 


Proximal excitation enhances SAC direction selectivity 
Our anatomical data indicate that bipolar cell inputs are restricted to 
the proximal two-thirds of SAC dendrites and SAC inputs are restricted 
to the proximal third. Next, we combined computational modelling and 
physiological imaging to examine how this connectivity pattern affects 
response properties of SAC dendrites. 

We based a single-cell SAC model on an existing passive model and
incorporated measured dendritic diameters and active conductances
along the dendrites such that the dendrites and soma both preferred
centrifugal motion (Extended Data Fig. 7 and Extended Data Table 1).
We then constructed a network model comprising one central SAC and
six surrounding SACs (Fig. 4a). SAC-SAC synapses were formed when
a presynaptic dendrite came within a defined distance of a postsynaptic
cell. The inter-soma distance (145 µm) was set to reproduce the relative
angle distributions observed anatomically (Extended Data Fig. 5) and
the radial distribution of inhibitory synapses (Fig. 4b, upper panel).




Figure 3 | Inhibitory inputs to mouse SACs. a, SAC dendrites
presynaptic to an ON and OFF SAC, colour-coded by the orientation
of the presynaptic soma relative to the synaptic contact. A total of 33%
(n = 30) of OFF dendrites and 45% (n = 30) of ON dendrites traced back to
somas within the data set, corresponding to inter-soma distances between
connected SACs of 98.5 ± 35.9 µm (OFF, mean ± s.d.) and 113.4 ± 37.0 µm




Figure 2 | Bipolar cell inputs to mouse SACs.
a, Location of BC synapses onto an OFF SAC,
colour-coded by bipolar cell type (b). Grey dots
indicate BC synapses that were not analysed.
b, Total OFF bipolar cell synapses (n = 343)
onto n = 2 OFF SACs versus radial distance
from soma. c, IPL depth of each synapse versus
the radial distance relative to their soma.
d–f, As in a–c for ON SACs; n = 262 ON
bipolar cell synapses onto n = 2 ON SACs. Grey
line (e) indicates pooled inputs from all three
type 5 bipolar cells. Scale bar, 50 µm.



We then measured the direction selectivity index (see Methods) at a 
distal dendritic location (the region of interest (ROI*), Fig. 4d) on the 
central SAC (Fig. 4a, boxed region). 
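
The geometry of the network model can be sketched in a few lines. This is an illustration only, not the authors' compartmental model: it assumes the six surrounding somas sit equally spaced on a ring around the central cell, uses a simple distance threshold as the proximity rule, and introduces a hypothetical dendrite length (125 µm) and contact distance (1 µm) that are not given in the text. Only the inter-soma distances come from the article.

```python
import math


def ring_of_neighbours(centre_xy, spacing_um, n_neighbours=6):
    """Soma positions of neighbours equally spaced on a ring (assumed layout)."""
    cx, cy = centre_xy
    positions = []
    for k in range(n_neighbours):
        angle = 2 * math.pi * k / n_neighbours
        positions.append((cx + spacing_um * math.cos(angle),
                          cy + spacing_um * math.sin(angle)))
    return positions


def forms_synapse(pre_point_xy, post_point_xy, contact_distance_um):
    """Proximity rule: place a synapse when two dendritic points are close enough."""
    return math.dist(pre_point_xy, post_point_xy) <= contact_distance_um


def closest_tip_approach_um(spacing_um, dendrite_length_um):
    """Radial position on the central cell reached by a neighbour's inward tip."""
    return abs(spacing_um - dendrite_length_um)


neighbours = ring_of_neighbours((0.0, 0.0), spacing_um=145.0)
DENDRITE_LENGTH_UM = 125.0  # hypothetical value, for illustration only
for spacing in (145.0, 200.0, 250.0):
    print(spacing, closest_tip_approach_um(spacing, DENDRITE_LENGTH_UM))
```

Under these assumptions, widening the inter-soma spacing moves the neighbours' output-bearing tips from the proximal part of the central cell's arbor towards its distal tips, which is the qualitative difference between the mouse-like and rabbit-like configurations.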

In response to moving bar stimuli, the ROI* preferred centrifugal 
motion compared to centripetal motion, as expected (Fig. 4d). During 
centrifugal motion, depolarization of the dendritic tips preceded inhibition
from neighbouring SACs. During centripetal motion, inhibition
preceded excitation and limited depolarization of the ROI*. We then 
modified the model to test whether the spatial separation between 
excitatory inputs and SAC outputs is important for direction selectivity. 
When bipolar cell inputs were uniformly distributed along SAC dendrites,
thereby overlapping with outputs, the ROI* preferred centripetal
over centrifugal motion (Fig. 4e). Bipolar cell inputs on distal tips 
increased surround inhibition during centrifugal motion and caused 
excitation to lead inhibition during centripetal motion, thereby reducing 
direction selectivity. This result suggests that restricting excitation to 
the proximal two-thirds of SAC dendrites establishes a temporal pattern 
of excitation and inhibition that enhances preference for centrifugal 
motion. 


Inhibition shapes velocity tuning 
When we simulated rabbit-like connectivity by increasing the inter-soma
distances (200 µm) to generate distal SAC-SAC contacts (Fig. 4b, lower
graph, 4c), the model still exhibited centrifugal preference. The most
obvious distinction between the mouse and rabbit eye is a fivefold difference
in diameter (Extended Data Fig. 8a). Consequently, a 1° visual
angle subtends 30 µm on the mouse retina and 150 µm on the rabbit retina.
Mouse and rabbit DSGCs respond to similar angular velocities
(Extended Data Fig. 8c), suggesting that SACs in both species are also
tuned to similar angular velocities. This translates to different linear
velocities: 10° s⁻¹ motion corresponds to 1,500 µm s⁻¹ across rabbit
retina, but just 300 µm s⁻¹ across mouse retina (Extended Data Fig. 8b).
Both the mouse and rabbit SAC models exhibited direction selectivity
at linear velocities above 500 µm s⁻¹ (Fig. 4g). At lower velocities,


[Figure 3b, c: b, presynaptic synapses (%) by amacrine cell class (SAC, WAC, NAC); c, number of synapses versus SAC-to-SAC relative angle (°) for OFF and ON SAC synapses.]

(ON, mean ± s.d.), consistent with the spacing of connected SACs based
on paired recordings in adult mice. b, Input synapses originating from
different amacrine cells. c, Histogram of relative angle (θ) between each
presynaptic and postsynaptic SAC dendrite for OFF (n = 217, black) and
ON (n = 154, grey) SAC synapses. Scale bar, 50 µm. NAC, narrow-field
amacrine cells; WAC, wide-field amacrine cells.




[Figure 4 panels: a, mouse connectivity; c, rabbit connectivity; b, percentage of synapses versus radial distance from soma (µm) for anatomical AC inputs, model AC inputs and SAC outputs in mouse and rabbit; d–f, h, i, simulated responses versus time (s) for CF and CP motion; g, direction selectivity index versus linear velocity (µm s⁻¹) for mouse and rabbit models.]


Figure 4 | Functional consequences of SAC network connectivity.
a, c, Compartmental models of mouse (a) and rabbit (c) networks.
b, Radial distributions of simulated synapses compared to anatomical
reconstructions (rabbit data analysed from ref. 2). d, Schematic of mouse
connectivity (top) and simulated responses (bottom) to centrifugal
(CF) and centripetal (CP) bar stimuli relative to the location ROI*. Bar
location at times t₁–t₆ indicated by dashed grey lines. Voltage and calcium
responses measured at the ROI*; synaptic conductances measured for the
central SAC. e, As in d, but with bipolar cell inputs distributed uniformly


however, direction selectivity in the rabbit model degraded because surround
inhibition and central excitation did not overlap sufficiently in
time to inhibit centripetal responses as strongly (Fig. 4i). The reduced
direction selectivity at lower velocities is consistent with velocity tuning
measured in rabbit DSGCs (Extended Data Fig. 8). By contrast, the
mouse model remained direction selective down to 100 µm s⁻¹. The
greater spatial overlap of synaptic inputs from neighbouring SACs and
bipolar cells in mouse enabled inhibition to coincide with excitation
at lower linear velocities during centripetal motion (Fig. 4h). Increasing
SAC inter-soma distances to 250 µm, generating tip-to-tip connectivity,
further shifted the tuning curve to higher velocities (Fig. 4g).
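
The conversion between angular and linear velocity used in this argument is simple arithmetic, sketched below. The two scale factors (30 µm and 150 µm per degree) are the values quoted in the text; the function and constant names are hypothetical.

```python
# Retinal distance subtended by 1 degree of visual angle (µm), as quoted in the text.
UM_PER_DEGREE = {"mouse": 30.0, "rabbit": 150.0}


def linear_velocity_um_per_s(angular_deg_per_s, species):
    """Convert an angular velocity (degrees per second) to µm per second on the retina."""
    return angular_deg_per_s * UM_PER_DEGREE[species]


def angular_velocity_deg_per_s(linear_um_per_s, species):
    """Convert a retinal linear velocity (µm per second) to degrees per second."""
    return linear_um_per_s / UM_PER_DEGREE[species]


print(linear_velocity_um_per_s(10.0, "rabbit"))  # 1500.0
print(linear_velocity_um_per_s(10.0, "mouse"))   # 300.0
print(angular_velocity_deg_per_s(100.0, "mouse"))
print(angular_velocity_deg_per_s(500.0, "rabbit"))
```

Under this conversion, 100 µm s⁻¹ on mouse retina and 500 µm s⁻¹ on rabbit retina both correspond to about 3.3° s⁻¹, so the lower velocity limits of the two models are similar when expressed in angular terms.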

We tested the prediction from the model by performing two-photon
laser scanning microscopy of dendritic calcium from mouse SACs
filled with OGB1 in whole-mount retinas (Fig. 4j). Bars of light were
swept across SAC receptive fields in eight equally spaced directions at
linear velocities ranging from 30 to 2,000 µm s⁻¹; direction selectivity



along SAC dendrites. f, As in d, but incorporating rabbit-like connectivity.
g, Simulated velocity tuning curves. The direction selectivity index
calculated from [Ca²⁺] at the ROI*. h, i, Simulated responses at 200 µm s⁻¹
for mouse and rabbit models, respectively. j, Fluorescence image of an
ON SAC filled with OGB1. k, Representative Ca²⁺ transients measured
at the varicosity highlighted in j in response to visual stimuli moving at
five different velocities (300% contrast). l, Velocity tuning of the direction
selectivity index (mean ± s.d.) in n = 41 SAC varicosities measured from
n = 3 ON SACs. Scale bars, 100 µm (a, c), 25 µm (j).




was calculated from calcium transients measured at individual distal
varicosities. As the model predicted, mouse SACs remained directionally
selective down to at least 100 µm s⁻¹ (Fig. 4k, l). These results
suggest that SAC circuitry has adapted to conserve angular velocity
tuning across species.
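
The article's exact definition of the direction selectivity index is given in its Methods, which are not reproduced here. As a hedged illustration only, the sketch below shows two definitions that are commonly used for this kind of data: a normalized difference between responses to two opposite directions, and a normalized vector sum over several stimulus directions. The function names and the example tuning curves are hypothetical.

```python
import math


def dsi_two_direction(preferred, null):
    """Normalized difference between responses to two opposite directions."""
    total = preferred + null
    return (preferred - null) / total if total else 0.0


def dsi_vector_sum(responses):
    """Vector-sum DSI and preferred direction from {direction_deg: response}."""
    total = sum(responses.values())
    if total == 0:
        return 0.0, None
    x = sum(r * math.cos(math.radians(d)) for d, r in responses.items())
    y = sum(r * math.sin(math.radians(d)) for d, r in responses.items())
    magnitude = math.hypot(x, y) / total
    preferred_deg = math.degrees(math.atan2(y, x)) % 360.0
    return magnitude, preferred_deg


# Eight equally spaced directions, as in the stimulus set described above.
directions = [0, 45, 90, 135, 180, 225, 270, 315]
untuned = {d: 1.0 for d in directions}  # made-up flat tuning curve
tuned = {d: 1.0 + math.cos(math.radians(d - 90)) for d in directions}  # made-up

print(dsi_two_direction(3.0, 1.0))  # 0.5
print(dsi_vector_sum(untuned)[0])
print(dsi_vector_sum(tuned))
```

Both indices run from 0 (no directional preference) to 1 (response to a single direction only), so either could be tracked across stimulus velocities to build a tuning curve of the kind shown in Fig. 4g, l.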


SAC-SAC inhibition expands contrast range 
To encode naturalistic stimuli effectively, SACs must also remain directionally
selective over a wide contrast range, a feature predicted by
our model (Fig. 5a, b). Simulations suggested that broad contrast tuning 
requires SAC-SAC inhibition: at high contrasts, blocking inhibition 
dramatically reduced direction selectivity in simulated SACs due to 
saturation of postsynaptic responses to both centrifugal and centripetal 
stimuli (Fig. 5c). 

We tested these predictions by imaging SAC dendritic responses to 
directional motion at different visual contrasts (Fig. 5d–g). Consistent




Figure 5 | Contrast dependence of SAC to SAC inhibition. a–c, Contrast tuning curve
of mouse network model in response to a bar
stimulus. Increasing contrast was simulated with
stronger bipolar cell depolarization. The maximal
conductance of inhibitory synapses in the model
was varied. d, Directional tuning of individual SAC
varicosities. Vectors indicate preferred direction
and the direction selectivity index magnitude.
Scale bars, 10 µm (upper), 0.5 direction selectivity
index (lower). e, Representative Ca²⁺ transients
from individual varicosities under low (100%)
and high (300%) contrast (SR: SR95531). f, The
direction selectivity index (DSI) of individual
varicosities for ON (left: n = 201 ROIs from n = 6
cells; right: n = 261 ROIs from n = 10 cells) and
OFF (left: n = 193 ROIs from n = 4 cells; right:
n = 197 ROIs from n = 9 cells) SACs. g, The
direction selectivity index (mean ± s.d.) following
SR95531 application as a fraction of control (paired
t-test, Bonferroni correction, *P < 0.001).


with the model, SACs remained directionally selective over different
contrast levels and blocking SAC-SAC inhibition with a GABA_A
receptor (GABA_AR) antagonist, SR95531 (25 µM), significantly reduced
direction selectivity, particularly in response to high-contrast stimuli
(Fig. 5g).


Inhibition shapes SAC receptive fields 

In rabbit retina, most SAC-SAC connections occur between distal dendrites
(Extended Data Fig. 1f); consequently, direction selectivity for
stimuli restricted to a SAC’s central receptive field relies primarily upon
intrinsic dendritic conductances rather than network inhibition. In
the mouse retina, we found that SACs receive SAC inputs exclusively
on their proximal dendrites (Fig. 1), suggesting that direction selectivity
within the central receptive field may rely on inhibition from
neighbouring SACs.

We explored this first in our mouse network model using a radially 
expanding or contracting (‘bullseye’) stimulus described previously 
(Fig. 6a, b)®. The model exhibited strong centrifugal direction selectivity 
in response to the bullseye stimulus with inhibition intact, because 
proximal inhibitory synapses became activated by centrally restricted 
stimuli (Fig. 6c). Removing inhibition reduced centrifugal direction 
selectivity over a range of simulated contrasts (Fig. 6c, d). We tested 
the prediction from the model by imaging dendritic calcium signals 
evoked by bullseye stimuli restricted to the SAC dendritic arbor 
(Fig. 6e). Blocking inhibition with SR95531 significantly reduced 


directional selectivity (Fig. 6f, g), as predicted. SR95531 may also 
influence presynaptic inhibition of bipolar cell terminals, potentially 
disrupting bipolar-cell-type-specific release kinetics. If this were the 
case, however, dendrite autonomous rabbit SAC directional selectivity 
should also be reduced by SR95531, in contrast to previous reports”. 


Discussion 

When reconstructing wiring diagrams, an important question is what 
level of detail is required to understand mechanistically how a neuronal 
circuit performs specific computations**”. Our results indicate that 
seemingly subtle differences in connectivity—such as whether cells 
receive inputs on proximal versus distal dendrites—can substantially 
influence neural coding and circuit behaviour. We found that segregating 
excitatory inputs from synaptic outputs along SAC dendrites helps 
establish strong centrifugal direction selectivity in a network model of 
SAC connectivity (Fig. 4e, also see ref. 38). More importantly, compar- 
ing wiring diagrams across species revealed a previously unrecognized 
connectivity difference in direction selectivity circuits of the mouse and 
rabbit retina (Fig. 1 and Extended Data Fig. 1). 

The two species exhibit comparable average SAC dendritic diame- 
ters and coverage factors’, suggesting that mouse and rabbit SAC 
networks theoretically could have been wired similarly. We found 
instead that the locus of presynaptic inhibition on SACs alters the linear 
velocity tuning of SAC direction selectivity to compensate for eye size 
difference and conserve angular velocity tuning across the two species 


Figure 6 | Receptive field structure of mouse SACs. a, b, The mouse
network model (b) was activated with a bullseye stimulus (a) centred
on and restricted to the diameter of the central SAC and expanded
or contracted to elicit centrifugal or centripetal motion. c, Simulated
dendritic [Ca²⁺] at the ROI* in response to centrifugal and centripetal
bullseyes (6.7 Hz, 150 μm period, 0.05 AU contrast) with inhibition intact
(black, grey) or blocked (orange, peach). d, The direction selectivity
index versus simulated contrast. e, Fluorescence image of OGB1-filled
SAC. f, Representative dendritic Ca²⁺ transients recorded in response to
centrifugal and centripetal bullseye stimuli (2 Hz, 140 μm period, 90%
contrast). Responses from the ROI in e. g, Scatter plot of n = 74 ROIs from
n = 5 ON SACs. SR95531 application significantly decreased the direction
selectivity index from 0.46 ± 0.24 (mean ± s.d.) to 0.17 ± 0.14 (paired
t-test, P = 2 × 10⁻¹³). Scale bars, 100 μm (b), 25 μm (e).




© 2016 Macmillan Publishers Limited. All rights reserved 




(Fig. 4). Inhibition among SACs also extended their contrast tuning 
range (Fig. 5): removing inhibition reduced SAC directional selectivity 
at high-stimulus contrasts, potentially rendering postsynaptic DSGCs 
blind to directional motion. Proximal inhibition also altered the recep- 
tive field structure of mouse SACs compared to previous reports of 
rabbit SACs (Fig. 6)®”. 

Our simulations effectively guided our physiological experiments, 
but they underrepresented the extensive connectivity of SACs, which 
actually receive inputs from dozens of neighbouring SACs (Fig. 3). 
The model also neglects inhibitory inputs to SACs from wide-field 
amacrine cells and narrow-field amacrine cells and detailed features 
of the presynaptic bipolar circuitry, important elements to incorporate 
in future simulations. Other visual stimulus features (for example, size, 
shape, spatial frequency) also remain to be explored. Nevertheless, the 
present study exemplifies how connectomic mapping, computational 
modelling and cellular physiology complement each other to provide 
new insights into neuronal circuit computations. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 28 December 2015; accepted 27 May 2016. 
Published online 22 June 2016. 


1. Barlow, H. B., Hill, R. M. & Levick, W. R. Retinal ganglion cells responding 
selectively to direction and speed of image motion in the rabbit. J. Physiol. 173, 
377-407 (1964). 

2. Famiglietti, E. V. Synaptic organization of starburst amacrine cells in rabbit 
retina: analysis of serial thin sections by electron microscopy and graphic 
reconstruction. J. Comp. Neurol. 309, 40-70 (1991). 

3. Vaney, D. I., Collin, S. P. & Young, H. M. in Neurobiology of the Inner Retina
(eds Weiler R. & Osborne N. N.) 157-168 (Springer, 1989).

4.  Briggman, K. L., Helmstaedter, M. & Denk, W. Wiring specificity in the 
direction-selectivity circuit of the retina. Nature 471, 183-188 (2011). 

5. Wei, W., Hamby, A. M., Zhou, K. & Feller, M. B. Development of asymmetric 
inhibition underlying direction selectivity in the retina. Nature 469, 402-406 
(2011). 

6. Hausselt, S. E., Euler, T., Detwiler, P. B. & Denk, W. A dendrite-autonomous 
mechanism for direction selectivity in retinal starburst amacrine cells. 
PLoS Biol. 5, e185 (2007). 

7. Euler, T., Detwiler, P. B. & Denk, W. Directionally selective calcium signals in 
dendrites of starburst amacrine cells. Nature 418, 845-852 (2002). 

8. Lee, S. & Zhou, Z. J. The synaptic mechanism of direction selectivity in distal 
processes of starburst amacrine cells. Neuron 51, 787-799 (2006). 

9. Tukker, J. J., Taylor, W. R. & Smith, R. G. Direction selectivity in a model of the 
starburst amacrine cell. Vis. Neurosci. 21, 611-625 (2004). 

10. Gavrikov, K. E., Dmitriev, A. V., Keyser, K. T. & Mangel, S. C. Cation-chloride 
cotransporters mediate neural computation in the retina. Proc. Natl Acad. Sci. 
USA 100, 16047-16052 (2003). 

11. Oesch, N. W. & Taylor, W. R. Tetrodotoxin-resistant sodium channels contribute to 
directional responses in starburst amacrine cells. PLoS One 5, e12447 (2010). 

12. Kim, J. S. et al. Space-time wiring specificity supports direction selectivity in 
the retina. Nature 509, 331-336 (2014). 

13. Taylor, W. R. & Smith, R. G. The role of starburst amacrine cells in visual signal 
processing. Vis. Neurosci. 29, 73-81 (2012). 

14. Munch, T. A. & Werblin, F. S. Symmetric interactions within a homogeneous 
starburst cell network can lead to robust asymmetries in dendrites of starburst 
amacrine cells. J. Neurophysiol. 96, 471-477 (2006). 

15. Enciso, G. A. et al. A model of direction selectivity in the starburst amacrine cell
network. J. Comput. Neurosci. 28, 567-578 (2010).

16. Millar, T. J. & Morgan, I. G. Cholinergic amacrine cells in the rabbit retina synapse
onto other cholinergic amacrine cells. Neurosci. Lett. 74, 281-285 (1987).

17. Denk, W. & Horstmann, H. Serial block-face scanning electron microscopy to
reconstruct three-dimensional tissue nanostructure. PLoS Biol. 2, e329 (2004).

18. Dacheux, R. F., Chimento, M. F. & Amthor, F. R. Synaptic input to the on-off
directionally selective ganglion cell in the rabbit retina. J. Comp. Neurol. 456,
267-278 (2003).

19. Roska, B. & Werblin, F. Vertical interactions across ten parallel, stacked
representations in the mammalian retina. Nature 410, 583-587 (2001).

20. Baden, T., Berens, P., Bethge, M. & Euler, T. Spikes in mammalian bipolar cells
support temporal layering of the inner retina. Curr. Biol. 23, 48-52 (2013).




21. Borghuis, B. G., Marvin, J. S., Looger, L. L. & Demb, J. B. Two-photon imaging of 
nonlinear glutamate release dynamics at bipolar cell synapses in the mouse 
retina. J. Neurosci. 33, 10972-10985 (2013). 

22. Greene, M. J., Kim, J. S. & Seung, H. S. Analogous convergence of sustained 
and transient inputs in parallel on and off pathways for retinal motion 
computation. Cell Rep. 14, 1892-1900 (2016). 

23. Ichinose, T., Fyk-Kolodziej, B. & Cohn, J. Roles of ON cone bipolar cell 
subtypes in temporal coding in the mouse retina. J. Neurosci. 34, 
8761-8771 (2014). 

24. Hoggarth, A. et al. Specific wiring of distinct amacrine cells in the directionally 
selective retinal circuit permits independent coding of direction and size. 
Neuron 86, 276-291 (2015). 

25. Ishii, T. & Kaneda, M. ON-pathway-dominant glycinergic regulation of
cholinergic amacrine cells in the mouse retina. J. Physiol. 592, 4235-4245
(2014).

26. Park, S. J. H., Kim, I.-J., Looger, L. L., Demb, J. B. & Borghuis, B. G. Excitatory
synaptic inputs to mouse On-Off direction-selective retinal ganglion cells lack
direction tuning. J. Neurosci. 34, 3976-3981 (2014).

27. Yonehara, K. et al. The first stage of cardinal direction selectivity is localized to
the dendrites of retinal ganglion cells. Neuron 79, 1078-1085 (2013).

28. Chen, M., Lee, S., Park, S. J., Looger, L. L. & Zhou, Z. J. Receptive field properties
of bipolar cell axon terminals in direction-selective sublaminas of the mouse
retina. J. Neurophysiol. 112, 1950-1962 (2014).

29. Bozkir, G., Bozkir, M., Dogan, H., Aycan, K. & Giller, B. Measurements of axial
length and radius of corneal curvature in the rabbit eye. Acta Med. Okayama
51, 9-11 (1997).

30. Park, H. et al. Assessment of axial length measurements in mouse eyes. 
Optometry Vision Sci. 89, 296-303 (2012). 

31. Chan, Y. C. & Chiao, C. C. Effect of visual experience on the maturation of 
ON-OFF direction selective ganglion cells in the rabbit retina. Vision Res. 48, 
2466-2475 (2008). 

32. Weng, S., Sun, W. & He, S. Identification of ON-OFF direction-selective ganglion 
cells in the mouse retina. J. Physiol. (Lond.) 562, 915-923 (2005). 

33. Denk, W., Strickler, J. H. & Webb, W. W. Two-photon laser scanning fluorescence
microscopy. Science 248, 73-76 (1990).

34. Euler, T. et al. Eyecup scope—optical recordings of light stimulus-evoked
fluorescence signals in the retina. Pflugers Arch. 457, 1393-1414 (2009).

35. Grzywacz, N. M. & Amthor, F. R. Robust directional computation in On-Off
directionally selective ganglion cells of rabbit retina. Vis. Neurosci. 24, 647-661
(2007).

36. Morgan, J. L. & Lichtman, J. W. Why not connectomics? Nature Methods 10,
494-500 (2013).

37. Denk, W., Briggman, K. L. & Helmstaedter, M. Structural neurobiology: missing
link to a mechanistic understanding of neural computation. Nat. Rev. Neurosci.
13, 351-358 (2012).

38. Vlasits, A. L. et al. A role for synaptic input distribution in a dendritic 
computation of motion direction in the retina. Neuron 89, 1317-1330 
(2016). 

39. Vaney, D. I. ‘Coronate’ amacrine cells in the rabbit retina have the ‘starburst’ 
dendritic morphology. Proc. R. Soc. Lond. B. 220, 501-508 (1984). 

40. Tauchi, M. & Masland, R. H. The shape and arrangement of the cholinergic 
neurons in the rabbit retina. Proc. R. Soc. Lond. B. 223, 101-119 (1984). 

41. Pérez De Sevilla Miller, L., Shelley, J. & Weiler, R. Displaced amacrine cells of 
the mouse retina. J. Comp. Neurol. 505, 177-189 (2007). 

42. Keeley, P. W., Whitney, I. E., Raven, M. A. & Reese, B. E. Dendritic spread and 
functional coverage of starburst amacrine cells. J. Comp. Neurol. 505, 
539-546 (2007). 

43. Kostadinov, D. & Sanes, J. R. Protocadherin-dependent dendritic self-avoidance 
regulates neural connectivity and circuit function. eLife 4, (2015). 


Acknowledgements We thank W. Denk for supporting the collection of the 
serial block-face scanning electron microscopy data in his laboratory. This work 
was supported by NIH grants EY016607 and EY022070 (R.G.S.), by the NINDS
Intramural Research Program (NS003145; J.S.D.) and (NS003133; K.L.B.), the
Max-Planck Society (K.L.B.), and the Pew Charitable Trusts (K.L.B.). 


Author Contributions H.D., R.G.S. and K.L.B. collected and analysed data; H.D., 
R.G.S., A.P.-P., J.S.D., and K.L.B. designed the study and wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
K.L.B. (kevin.briggman@nih.gov). 


Reviewer Information Nature thanks G. Knott and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 




METHODS 


No statistical methods were used to predetermine sample size. All n values refer to 
biological replicates. The experiments were not randomized. The investigators were 
not blinded to allocation during experiments and outcome assessment. 

EM tissue preparation. An adult wild-type (C57BL/6) mouse (postnatal day 30 
(P30)) was anaesthetized with isoflurane (Baxter) inhalation and killed by cer- 
vical dislocation. The eyes were enucleated and transferred to a dish containing 
carboxygenated room-temperature saline, in which the retinas were dissected. 
All procedures were approved by the local animal care committee and were in 
accordance with the law of animal experimentation issued by the German Federal 
Government. We used a commercially available saline (Biometra) that was
supplemented with 0.5 mM L-glutamine and carboxygenated (95% O2/5% CO2). We
hemisected the retina and mounted it on filter paper. The retina was fixed in a
solution containing 0.1 M cacodylate buffer, 4% sucrose and 2% glutaraldehyde, pH 7.2
(Serva). The tissue was fixed for 2 h at room temperature and then rinsed in 0.1 M
cacodylate buffer plus 4% sucrose overnight. A 1 × 1 mm² region of the retina,
approximately halfway between the optic disk and the peripheral edge of the retina, 
was then excised. The tissue was then stained in a solution containing 1% osmium 
tetroxide, 1.5% potassium ferrocyanide, and 0.15 M cacodylate buffer for 2h at 
room temperature. The osmium stain was amplified with 1% thiocarbohydrazide
(1 h at 50 °C), and 2% osmium tetroxide (1 h at room temperature). The tissue was
then stained with 2% aqueous uranyl acetate for 12 h at room temperature and
lead aspartate for 12 h at room temperature. The tissue was dehydrated through an
ethanol series (70%, 90%, 100%), transferred to propylene oxide, infiltrated with 
50%/50% propylene oxide/Epon Hard, and then 100% Epon Hard. The block was 
cured at 60°C for 24h. 

Serial block-face scanning electron microscopy acquisition. The retina (k0725)
was cut out of the flat-embedding blocks and re-embedded in Epon Hard, on
aluminium stubs for serial block-face scanning electron microscopy, with the retinal
plane vertical. The samples were then trimmed to a block face of ~200 μm wide
and ~400 μm long. The samples were imaged in a scanning electron microscope
with a field-emission cathode (QuantaFEG 200, FEI Company). Back-scattered
electrons were detected using a custom-designed detector based on a special
silicon diode (AXUV, International Radiation Detectors) combined with a
custom-built current amplifier. The incident electron beam had an energy of 2.0 keV
and a current of ~110 pA. Images were acquired with a pixel dwell time of 2.5 μs
and size of 13.2 nm × 13.2 nm, which corresponds to a dose of about 10 electrons
per nm². Imaging was performed at high vacuum, with the sides of the block
evaporation-coated with a 100-200 nm thick layer of gold. The electron microscope
was equipped with a custom-made microtome designed by W. Denk that was
previously used to collect retinal serial block-face scanning electron microscopy
data*“*. The section thickness was set to 26 nm. 10,112 consecutive block faces were
imaged, resulting in aligned data volumes of 4,992 × 16,000 × 10,112 voxels (1 × 5
mosaic of 3,584 × 3,094 images), corresponding to an approximate spatial volume
of 50 × 210 × 260 μm³. The edges of neighbouring mosaic images overlapped by
~1 μm. The cutting quality degraded during the course of the experiment,
meaning the images in the first half of the data volume (approximately the first 5,000
slices) are of higher quality than the second half of the volume. Nevertheless, thin
neurites could be manually annotated throughout the volume. The imaged region
spanned the inner plexiform layer of the retina and included the ganglion cell
layer and part of the inner nuclear layer. Cross-correlation-derived shift vectors
between neighbouring mosaic images and consecutive slices were used for a global
least-squares fit across all shift vectors to align the data sets off-line to subpixel
precision by Fourier shift-based interpolation. The data sets were then split into cubes
(128 × 128 × 128 voxels) for viewing in KNOSSOS (http://www.knossostool.org).
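
The quoted dose of about 10 electrons per nm² can be checked from the beam current, pixel dwell time and pixel size given above. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative and are not part of the acquisition software.

```python
# Sanity check of the electron dose from the stated imaging parameters.
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one electron, in coulombs


def electron_dose_per_nm2(beam_current_a, dwell_time_s, pixel_size_nm):
    """Return the number of electrons delivered per square nanometre."""
    # Charge deposited on one pixel, converted to a number of electrons.
    electrons_per_pixel = beam_current_a * dwell_time_s / ELEMENTARY_CHARGE_C
    # Spread over the (square) pixel area.
    return electrons_per_pixel / (pixel_size_nm ** 2)


# ~110 pA beam current, 2.5 microsecond dwell time, 13.2 nm pixels.
dose = electron_dose_per_nm2(110e-12, 2.5e-6, 13.2)
print(f"{dose:.1f} electrons per nm^2")
```

With these inputs the result is close to 10 electrons per nm², in line with the value stated in the text.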
Skeleton tracing and contact annotation. Skeletons were traced using KNOSSOS 
and consisted of nodes and connections between them. Nodes were placed approx- 
imately every 250 nm. Synapses were manually identified and annotated within 
KNOSSOS. All analyses of skeletons were performed using MATLAB (MathWorks). 
Modelling. We constructed models of an individual SAC and a network of 7 SACs 
using the simulation language Neuron-C*. We digitized a SAC morphology from 
a confocal stack of a labelled SAC, but included a multiplicative ‘diameter factor’ 
set for each dendritic region based on the dendritic diameters measured from the 
electron microscopy reconstructions (Extended Data Fig. 7). The SAC network was 
assembled with an algorithm that synaptically interconnected the SAC dendrites 
based on their location and orientation. Each SAC typically made a total of 120-250 
inhibitory synapses onto its neighbours. The central SAC received about twice the 
number of inhibitory synapses as the surrounding SACs because of the ‘edge effect’.
Therefore, to achieve a balance between inhibition in the central SAC and its 6
surrounding SACs, we reduced the conductance of the surround-to-central inhibitory
synapses by 50%. BCs were created in a semi-random pattern and were connected
to SACs with ribbon synapses if they were within a criterion distance. Synapses were 
modelled as Ca²⁺-driven neurotransmitter release that bound to a postsynaptic 




channel defined by a ligand-activated Markov sequential-state machine****. The 
excitatory conductances were typically set to 230 pS and inhibitory conductances 
were typically 80-160 pS. Membrane ion channels were defined by a voltage-gated 
Markov state machine and were placed at densities specified for each region of the 
cell. See Extended Data Table 1 for biophysical parameters. 

The contrast of the stimulus presented to the SAC models was achieved by 
varying the strength of excitatory input from bipolar cells. This was accomplished 
by voltage-clamping a presynaptic compartment that represented each bipolar cell 
according to the spatiotemporal pattern of the stimulus. The presynaptic holding 
potential in the bipolar cells was just above the threshold for synaptic release, 
typically approximately −45 mV. 

The synaptic connectivity of the SAC output synapses was set automatically by 
an algorithm based on the orientation of presynaptic and possible candidates for 
the postsynaptic dendrite. When the orientations of both dendrites were within a 
specific angular range, a synaptic connection was made. This synaptic placement 
depended on several other criteria, for example, whether the presynaptic point 
fit within the allowable spacing and radial distribution on the presynaptic den- 
drite, and also whether the closest point on the postsynaptic dendrite was within 
a specified distance. The orientations were computed as the absolute angle from 
the prospective presynaptic point on the distal dendrite to the soma. 
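
The placement rule described above can be illustrated with a short sketch. This is not the Neuron-C algorithm itself: the function names, the two-dimensional coordinates, the angular range and the distance threshold are all hypothetical values chosen for illustration, and the spacing and radial-distribution criteria are omitted.

```python
import math


def orientation_deg(point, soma):
    """Absolute angle in degrees, in [0, 360), of the vector from a dendritic point to its soma."""
    return math.degrees(math.atan2(soma[1] - point[1], soma[0] - point[0])) % 360.0


def angular_difference_deg(a, b):
    """Smallest difference between two orientations, in [0, 180]."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def should_connect(pre_point, pre_soma, post_point, post_soma,
                   angle_range_deg=(150.0, 180.0), max_distance_um=2.0):
    """Decide whether a candidate synapse satisfies the distance and orientation criteria.

    angle_range_deg and max_distance_um are hypothetical; the paper does not
    state the values used.
    """
    # Distance criterion: the postsynaptic point must be close enough.
    if math.dist(pre_point, post_point) > max_distance_um:
        return False
    # Orientation criterion: the relative orientation must fall in the allowed range.
    relative = angular_difference_deg(orientation_deg(pre_point, pre_soma),
                                      orientation_deg(post_point, post_soma))
    low, high = angle_range_deg
    return low <= relative <= high


# Two dendrites from somas on opposite sides meet near x = 10 um.
print(should_connect((10, 0), (0, 0), (10, 1), (20, 1)))
```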

Direction selectivity indices were calculated based on the calcium concen- 
tration at a location along a central SAC dendrite using the following equation: 
DSI = (PD − ND)/PD, where PD is the response in the centrifugal direction and 
ND is the response in the centripetal direction. 
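
As a worked illustration of this definition, the index can be computed as follows. This is a minimal sketch; the function name and the example responses are illustrative, not values from the simulations.

```python
def direction_selectivity_index(pd_response, nd_response):
    """DSI = (PD - ND) / PD, with PD the centrifugal and ND the centripetal response."""
    if pd_response == 0:
        raise ValueError("PD response must be non-zero to compute the index")
    return (pd_response - nd_response) / pd_response


# Equal responses give 0 (no selectivity); a silent centripetal response gives 1.
print(direction_selectivity_index(1.0, 1.0))
print(direction_selectivity_index(1.0, 0.0))
```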

Models were run on an array of 3.2 GHz AMD Opteron CPUs interconnected 
by Gigabit ethernet, with a total of 220 CPU cores. Simulations of the 7-SAC model 
took 4-48 h, depending on the model complexity and duration of simulated time. 
The simulations were run on the Mosix parallel distributed task system under the 
Linux operating system. 

Physiological recordings: tissue and calcium-indicator loading. All physiological 
animal procedures were conducted in accordance with US National Institutes of 
Health guidelines, as approved by the National Institute of Neurological Disorders 
and Stroke Animal Care and Use Committee (ASP 1361). Both male and female 
adult (P30-P60) ChAT-tdTomato mice were used in the experiments (Jackson 
Laboratory). The mice were anaesthetized with isoflurane (Baxter) inhalation 
and killed by cervical dislocation. Retinas were isolated and all subsequent proce- 
dures were performed at room temperature in Ames media (Sigma) equilibrated 
with 95% O2/5% CO2. Sharp electrodes (resistance 100-150 MΩ) were pulled on a
P-97 Micropipette Puller (Sutter). Iontophoresis of Oregon Green 488
BAPTA-1 (OGB1, Life Technologies) into single cells was achieved by applying the 
buzz function in MultiClamp 700B software at 50 ms pulses (Molecular Devices) 
while the electrode filled with OGB1 (15 mM in water) was on the cell membrane. 
Pipettes were withdrawn as soon as cell bodies began to fill, and cells were left to 
recover for 20-30 min before imaging. To block inhibition, the GABA_A receptor
antagonist SR95531 (25 μM, Tocris) was added to the extracellular medium.
Physiological recordings: two-photon microscopy. For two-photon imaging, we 
used a customized microscope (Sutter Movable Objective Microscope), controlled 
by ScanImage”’, equipped with through-the-objective light stimulation*! and 
two detection channels for fluorescence imaging (green, BP 500-540, and red, BP 
575-640; Chroma/Thorlabs). The excitation source was a mode-locked Ti/sapphire 
laser (Chameleon, Coherent) tuned to 920nm. The microscope was used to 
simultaneously visualize ChAT-tdTomato-labelled SACs for single-cell targeting 
(red channel) and to monitor calcium activity reflected by OGB1 fluorescence 
changes (green channel). During functional imaging, the scan parameters were 
256 × 100 pixels at 10 Hz frame rate. Scanning was triggered by the light stimulation.
Field of view during acquisition was 80 μm × 80 μm.

Physiological recordings: light stimulation. Light stimulation was generated by 
custom-written code in Igor software (Wavemetrics) and 4D Workshop 4 IDE 
(4D Systems) to control an LCD mask in front of a collimated LED (405 nm, 
Thorlabs) with a bandpass filter (BP 405, Thorlabs). The stimuli were projected 
onto the retina through the objective lens (XLUMPlanFL 20×, 0.95 NA
water-immersion, Olympus). Stimulus contrast varied between 100% and 300%, with the
300% stimulus intensity at ~25 × 10° photons s⁻¹ μm⁻² on a background
intensity of ~6 × 10° photons s⁻¹ μm⁻². For the bar stimulus, the bar (400 × 400 μm)
moved in one of eight evenly spaced directions at a range of velocities between
0.03 and 1 mm s⁻¹. The bullseye stimulus was configured as previously described®.
Each stimulus was repeated 3-5 times and responses were averaged.
Calcium-imaging data analysis. Image stacks were analysed using custom Igor 
(Wavemetrics) functions. Image segmentation was performed by simple thresholding
and ROIs were selected as varicosities along dendrites. Response to each stimulus
was calculated as the average ΔF/F during 1 s after stimulus onset; baseline was
determined by measuring the fluorescence before the stimulus. The responses 
were averaged across stimulus presentations. The direction selectivity index was 




calculated as (PD − ND)/PD, where ND is the null (or centripetal) and PD is the 
preferred (or centrifugal) response. 
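
The response measure described above can be sketched as follows. This is an illustrative reimplementation in Python, not the custom Igor functions used for the analysis; the function name, the use of all pre-stimulus frames as baseline and the example trace are assumptions.

```python
from statistics import mean


def delta_f_over_f_response(trace, onset_index, frame_rate_hz=10.0, window_s=1.0):
    """Average dF/F over a window after stimulus onset for one ROI fluorescence trace.

    Assumes the baseline F0 is the mean fluorescence of all frames before onset.
    """
    baseline = mean(trace[:onset_index])
    n_frames = int(round(window_s * frame_rate_hz))
    window = trace[onset_index:onset_index + n_frames]
    return mean([(f - baseline) / baseline for f in window])


# Hypothetical trace at 10 Hz: 0.5 s baseline, 1 s response, 0.5 s recovery.
trace = [100.0] * 5 + [150.0] * 10 + [100.0] * 5
response = delta_f_over_f_response(trace, onset_index=5)
print(response)
```

Responses computed this way for repeated presentations of the same stimulus would then be averaged before the direction selectivity index is calculated.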

Statistical analyses. We included as much of the raw anatomy data as practical in 
the figures, including neuron and synapse distributions and spatial locations. The 
identities of neurons presynaptic to SACs were, by definition, unknown to the
annotator before skeletonization. No reconstructed neurons were excluded from the
analysis. For comparing relative angle distributions, we used the non-parametric 
Kolmogorov-Smirnov test. For dendritic calcium experiments incorporating phar- 
macology, all measurements were paired (that is, responses at a ROI are reported 
both before and after drug application). The number of recorded cells was selected 
to provide typically hundreds of ROIs for comparison and paired t-tests were 
used to assess statistical significance. All sample sizes and statistical test results 
are reported in the figure legends. Statistical tests were performed in MATLAB 
or GraphPad. 
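
For readers unfamiliar with the paired design, the test statistic can be sketched as below. This is an illustration only: the analyses in this study were run in MATLAB or GraphPad, and the function name and the example values are hypothetical. The P value would be obtained from a t distribution with n − 1 degrees of freedom, which is not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [b - a for b, a in zip(before, after)]
    # Standard error uses the sample standard deviation of the differences.
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical DSI values at four ROIs before and after drug application.
control = [0.5, 0.6, 0.4, 0.7]
drug = [0.2, 0.3, 0.2, 0.3]
t_value = paired_t_statistic(control, drug)
print(round(t_value, 2))
```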


Code availability. The Neuron-C simulation language that generated the models 
described above is available at: ftp://retina.anatomy.upenn.edu/pub/rob/nc.tgz. 
Included in this distribution are the realistic SAC morphology, the ‘retsim’ retinal 
circuit simulator that generated the models, and the ‘rsbac_stim_plots_vel’ script 
that ran multiple model jobs in parallel. 


44. Helmstaedter, M. et al. Connectomic reconstruction of the inner plexiform layer 
in the mouse retina. Nature 500, 168-174 (2013). 

45. Smith, R. G. NeuronC: a computational language for investigating functional 
architecture of neural circuits. J. Neurosci. Methods 43, 83-108 (1992). 

46. Schachter, M. J., Oesch, N., Smith, R. G. & Taylor, W. R. Dendritic spikes amplify 
the synaptic signal to enhance detection of motion in a simulation of the 
direction-selective ganglion cell. PLOS Comput. Biol. 6, e1000899 (2010). 

47. Pologruto, T. A., Sabatini, B. L. & Svoboda, K. ScanImage: flexible software for
operating laser scanning microscopes. Biomed. Eng. Online 2, 13 (2003).




Extended Data Figure 1 | EM data set, additional SAC reconstructions
and rabbit connectivity. a, Conventionally stained serial block-face
scanning electron microscopy volume of a mouse retina. b, Reconstructed
ON-OFF DSGC. c-e, A second reconstructed ON and OFF SAC with
annotated synapse locations. f, Annotation of the radial distribution of
input and output synapses to and from approximately one-half of an OFF
SAC dendritic arbor in rabbit retina. Data analysed from fig. 15 in ref. 2.
[Panel f axes: radial distance from soma (μm) versus number of synapses;
series: AC inputs, BC inputs, SAC outputs. Scale bar label, 50 μm.]




Extended Data Figure 2 | Classification of OFF bipolar cells. a, Types
1/2 and types 3/4 separated by IPL depth. b, Types 1 and 2 separate by
stratification width and axonal arborization area (convex hull). c, Types 3a,
3b and 4 separate by stratification depth and axonal arborization area.
d, Mosaic patterns and stratification profiles of OFF bipolar cells. e, The
number of synapses (mean ± s.d.) each bipolar cell, by type, formed with
each SAC. f, Location of bipolar cell synapses onto a second OFF SAC,
colour-coded by bipolar cell type. g, The IPL depth of each synapse versus
the radial distance relative to the soma.


Extended Data Figure 3 | Classification of ON bipolar cells. a, Type 5
and type 7 bipolar cells separated by IPL depth. b, Types 5o (outer),
5t (thick) and 5i (inner) further subdivide based on IPL depth and
stratification width. c, Mosaic patterns and stratification profiles of ON
bipolar cells. d, Summary of the number of synapses (mean ± s.d.) each
bipolar cell, by type, formed with each SAC. e, Location of bipolar cell
synapses onto a second ON SAC, colour-coded by bipolar cell type. f, The
IPL depth of each synapse versus the radial distance relative to the soma.
[Panel d table, #syn/cell (# cells), row-to-type assignment not recoverable:
2.0 ± 1.5 (n = 10); 2.3 ± 1.5 (n = 18); 2.1 ± 1.4 (n = 12); 3.5 ± 3.7 (n = 24).]




Extended Data Figure 4 | Amacrine cell types presynaptic to SACs. a, b, SACs presynaptic to the second pair of mouse SACs colour-coded by absolute 
orientation. c, d, Wide-field amacrine cells presynaptic to SACs. e, Narrow-field amacrine cells presynaptic to ON SACs. 




Extended Data Figure 5 | Relative angles between presynaptic
and postsynaptic SAC dendrites. a, Schematic of the relative angle
measurement: parallel wiring = 0°, anti-parallel wiring = 180°.
b, Locations of SAC input synapses colour-coded by relative angle. Grey
locations indicate AC synapses that were not analysed. c, Cumulative
distributions of the relative angles between each presynaptic and
postsynaptic OFF SAC dendrite for synapses (black) and proximities
(grey). Dashed line indicates a uniform distribution. d, Relative angle
for each synapse was uncorrelated with the radial distance from the
postsynaptic somas (r = 0.07, P = 0.16). Scale bar, 50 μm.
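
The relative angle convention in panel a (parallel wiring = 0°, anti-parallel wiring = 180°) can be illustrated with a short sketch. It assumes, for illustration only, that each dendrite's direction is approximated by the two-dimensional vector from its soma to the synapse location; the function name and coordinates are hypothetical.

```python
import math


def relative_angle_deg(pre_soma, post_soma, synapse):
    """Angle in [0, 180] degrees between the pre- and postsynaptic soma-to-synapse vectors."""
    pre_vec = (synapse[0] - pre_soma[0], synapse[1] - pre_soma[1])
    post_vec = (synapse[0] - post_soma[0], synapse[1] - post_soma[1])
    norm = math.hypot(*pre_vec) * math.hypot(*post_vec)
    if norm == 0:
        raise ValueError("synapse must not coincide with either soma")
    dot = pre_vec[0] * post_vec[0] + pre_vec[1] * post_vec[1]
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


# Somas on opposite sides of the synapse: anti-parallel wiring.
print(relative_angle_deg((0, 0), (20, 0), (10, 0)))
# Somas on the same side of the synapse: parallel wiring.
print(relative_angle_deg((0, 0), (5, 0), (10, 0)))
```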




[Figure panels: a, percentage of output synapses by postsynaptic cell type (GC, SAC, WAC, BC) for ON and OFF SACs; b, c, synapse locations.]

Extended Data Figure 6 | Identities of neurons postsynaptic to SAC output synapses. a, Percentage of output synapses formed with different
postsynaptic cell types, colour-coded by postsynaptic cell class: ganglion cells (GC) (blue), SACs (red), bipolar cells (BC) (cyan), and wide-field
amacrine cells (WAC) (green). b, Locations of 83 annotated output synapses on 1 ON SAC dendrite fragment. c, Locations of 110
annotated output synapses on 2 OFF SAC dendrite fragments. Scale bar, 50 μm.




[Figure panels: a, plotted against radial distance (μm) for ON and OFF SACs; c, d, traces plotted against time (s).]

Extended Data Figure 7 | Single SAC model. a, Dendrite diameters sampled from an ON SAC (grey) and an OFF SAC (black) at different
radial distances from their respective somas. b, Single SAC morphology used in all simulations. c, Somatic voltage clamp simulation showed poor
space clamp of even proximal dendrites. Voltage traces measured at different distances (20-150 μm) from the soma. d, Somatic (solid line) and
distal dendrite (dashed line) voltage time series in response to an annulus moving centrifugally or centripetally. The addition of active conductances
to SAC dendrites (see Extended Data Table 1) rendered somatic voltage recordings directionally selective for centrifugal compared to centripetal
stimulation, consistent with electrophysiological measurements. Scale bar, 50 μm.




[Figure panels: a, schematic of the mouse eye (3 mm axial diameter) and rabbit eye (15 mm axial diameter) with the retina marked; b, tuning curves plotted against linear velocity (μm/s) for rabbit (Chan & Chiao (2008), figure 2F) and mouse (Weng et al. (2005), converted); c, tuning curves plotted against angular velocity (deg/s) for rabbit (Chan & Chiao (2008), converted) and mouse (Weng et al. (2005), figure 1D).]



Extended Data Figure 8 | Velocity tuning of rabbit and mouse direction 
selectivity circuits. a, Schematic of the difference in axial diameters and 
subtended angle on the retina of rabbit and mouse eyes. b, Linear velocity 
tuning curves from rabbit and mouse ON-OFF DSGCs. c, Angular 
velocity tuning curves from rabbit and mouse ON-OFF DSGCs. Data 
analysed from fig. 2F of ref. 31 and fig. 1D of ref. 32. 
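
The "converted" curves in panels b and c imply a change of units between linear velocity on the retina and angular velocity in visual space. A minimal sketch of such a conversion follows, assuming a single scale factor (micrometres of retina per degree of visual angle) for each species. The function names and the scale factors are hypothetical placeholders chosen only to mirror the fivefold difference in axial diameter shown in panel a; they are not values taken from this article.

```python
def linear_to_angular(linear_um_per_s, retina_um_per_deg):
    """Convert image speed on the retina (um/s) to angular speed (deg/s)."""
    if retina_um_per_deg <= 0:
        raise ValueError("retina_um_per_deg must be positive")
    return linear_um_per_s / retina_um_per_deg


def angular_to_linear(angular_deg_per_s, retina_um_per_deg):
    """Convert angular speed (deg/s) to image speed on the retina (um/s)."""
    if retina_um_per_deg <= 0:
        raise ValueError("retina_um_per_deg must be positive")
    return angular_deg_per_s * retina_um_per_deg


# ASSUMPTION: illustrative scale factors only, set in a 1:5 ratio to echo the
# 3 mm versus 15 mm axial diameters; replace with measured values before use.
ASSUMED_UM_PER_DEG = {"mouse": 30.0, "rabbit": 150.0}

if __name__ == "__main__":
    speed_um_per_s = 300.0
    for species, scale in ASSUMED_UM_PER_DEG.items():
        print(species, linear_to_angular(speed_um_per_s, scale), "deg/s")
```

Under these assumed factors, one linear speed on the retina corresponds to a fivefold higher angular speed in the smaller eye, which is why tuning curves from the two species can align on one axis and not on the other.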




Extended Data Table 1 | Table of biophysical parameters used in model SACs 


a 

Biophysical parameters for SAC model 

Rm (Q-cm?) 10,000 

Ri (Q-cm?) 75 

NaV1.8 channel density (S/cm?) soma: 0 
proximal 1/3: te) 
medial 1/3: 3e? 
distal 1/3: 3e3 

Kdr channel density (S/cm?) soma 3e? 
proximal 1/3: 2e3 
medial 1/3: 2e? 
distal 1/3: 2e3 

L-type Ca2+ channel density (S/cm?) soma 0 
proximal 1/3: 0 
medial 1/3: 1e3 
distal 1/3: 1e3 




ARTICLE 


doi:10.1038/nature18590 


Pore-forming activity and structural 
autoinhibition of the gasdermin family 


Jingjin Ding, Kun Wang, Wang Liu, Yang She, Qi Sun, Jianjin Shi, Hanzi Sun, Da-Cheng Wang & Feng Shao 


Inflammatory caspases cleave the gasdermin D (GSDMD) protein to trigger pyroptosis, a lytic form of cell death that 
is crucial for immune defences and diseases. GSDMD contains a functionally important gasdermin-N domain that is 
shared in the gasdermin family. The functional mechanism of action of gasdermin proteins is unknown. Here we show 
that the gasdermin-N domains of the gasdermin proteins GSDMD, GSDMA3 and GSDMA can bind membrane lipids, 
phosphoinositides and cardiolipin, and exhibit membrane-disrupting cytotoxicity in mammalian cells and artificially 
transformed bacteria. Gasdermin-N moved to the plasma membrane during pyroptosis. Purified gasdermin-N efficiently 
lysed phosphoinositide/cardiolipin-containing liposomes and formed pores on membranes made of artificial or natural 
phospholipid mixtures. Most gasdermin pores had an inner diameter of 10-14 nm and contained 16 symmetric protomers. 
The crystal structure of GSDMA3 showed an autoinhibited two-domain architecture that is conserved in the gasdermin 
family. Structure-guided mutagenesis demonstrated that the liposome-leakage and pore-forming activities of the 
gasdermin-N domain are required for pyroptosis. These findings reveal the mechanism for pyroptosis and provide insights 
into the roles of the gasdermin family in necrosis, immunity and diseases. 


Pyroptosis is critical for host defences against infection and danger 
signals, and excessive pyroptosis causes immunological diseases 
and septic shock. Pyroptosis involves cell swelling and lysis, which 
causes massive release of cellular contents and thereby triggers strong 
inflammation’. The term pyroptosis was originally used to describe 
caspase-1-mediated macrophage death”. Caspase-1 belongs to the 
inflammatory caspase group, which also includes murine caspase-11 
and its human counterparts caspase-4 and -5 (ref. 3). Unlike 
caspase-11 (ref. 4), caspase-4 and -5 can also activate pyroptosis in 
non-monocytotic cells’. Caspase-1 acts downstream of the inflam- 
masome complex, which is scaffolded by an Nod-like receptor (NLR) 
protein, absent in melanoma 2 (AIM2) or pyrin, and recognizes bac- 
teria, other microbes and endogenous threats>®. Caspase-4, -5 and -11 
sense’ and are activated by direct binding to lipopolysaccharide 
(LPS)?”; hyperactivation of these caspases causes septic shock. 

Recent studies have identified the GSDMD protein, which critically 
determines pyroptosis¹⁰,¹¹. GSDMD is cleaved by all inflammatory 
caspases between its N-terminal gasdermin-N and C-terminal 
gasdermin-C domains. This cleavage releases the autoinhibition 
by gasdermin-C of gasdermin-N, which has intrinsic pyroptosis-inducing 
activity. The absence of GSDMD does not affect caspase-1 
processing of interleukin (IL)-1β, but blocks mature IL-1β secretion, 
suggesting that pyroptosis is required for noncanonical cytokine 
secretion¹⁰,¹¹. Besides GSDMD, the gasdermin family also includes 
GSDMA, GSDMB, GSDMC, DFNA5 and DFNB59 (refs 12,13). Mice 
lack GSDMB but have three GSDMA (GSDMA1-3) and four GSDMC 
(GSDMC1-4) proteins. Other gasdermins are insensitive to inflammatory 
caspases¹⁰. Dominant mutations in Gsdma3 (refs 14-16) and 
DFNA5 (or autosomal recessive mutation in DFNB59)¹⁷,¹⁸ cause alopecia 
and hyperkeratosis in mice and nonsyndromic hearing loss in 
humans, respectively. Disease-associated mutants of GSDMA3 and 
its gasdermin-N domain alone can activate pyroptosis owing to loss 
of autoinhibition¹⁰. Despite the importance of gasdermins in pyroptosis 
and inflammation, the mechanisms of action of GSDMD and the 
gasdermin family are unknown. 


Cytotoxicity of gasdermin-N from multiple gasdermins 
All gasdermins, except for DFNB59, adopt a two-domain architecture. 
As found in GSDMD and GSDMA3 (ref. 10), the gasdermin-N 
domains of GSDMA, GSDMB, GSDMC or DFNA5, but not the full-length 
proteins, induced extensive pyroptosis in human 293T cells 
(Extended Data Fig. 1a, b). This suggests that gasdermins in general 
are pyroptosis factors. Expression of the GSDMD gasdermin-N domain 
(GSDMD-N) is highly toxic to Escherichia coli (Extended Data Fig. 1c, d), 
whereas little cytotoxicity was observed with full-length GSDMD and 
its GSDMD-C domain (both proteins could be highly expressed in 
E. coli). Similar phenomena were observed with GSDMA, GSDMA3, 
GSDMC and DFNA5 (Extended Data Fig. 1d). Thus, the gasdermin-N 
domain has intrinsic cytotoxicity in mammalian cells and its overexpression 
can also kill bacteria. 


Gasdermin-N domains can bind membrane lipids 

We hypothesized that the gasdermin-N domains might disrupt membranes 
to cause pyroptosis. To test this hypothesis, we assayed the binding 
of recombinant gasdermin-N to membrane lipids. To circumvent 
the toxicity of gasdermin-N in E. coli, we engineered a PreScission protease (PPase) site 
(LEVLFQGP) into the inter-domain linker in GSDMD, 
GSDMA and GSDMA3 (it has been shown that PPase cleavage of the 
engineered GSDMD or GSDMA3 can trigger pyroptosis¹⁰) and purified 
the full-length gasdermins. Notably, the gasdermin-N and -C domains 
remained bound together following in vitro PPase cleavage. When the 
noncovalent complexes (N+C) were incubated with liposomes containing 
phosphatidylcholine as the skeleton lipid, all three gasdermin-N 
(30-35 kDa) but not gasdermin-C domains (20-25 kDa) were precipitated 
by liposomes containing 10% or 20% phosphatidylinositol-4,5-bisphosphate 
(PtdIns(4,5)P2) (a major phosphoinositide in the 
plasma membrane), but not unphosphorylated phosphatidylinositol 
(Fig. 1a and Extended Data Fig. 2a, b). Neither full-length GSDMD, 
GSDMA and GSDMA3 nor their gasdermin-C domains bound 
phosphoinositide. Gasdermin-N could also bind to liposomes containing 
monophosphorylated (PtdIns3P, PtdIns4P and PtdIns5P), 


¹National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China. ²National Institute of Biological Sciences, Beijing, 102206, China. 
³Foshan University, Guangdong, 528000, China. ⁴National Institute of Biological Sciences, Collaborative Innovation Center for Cancer Medicine, Beijing, 102206, China. 




[Figure panels: a, liposomes of 80% PC + 20% phosphoinositides (PI or PI(4,5)P2), with supernatant (S) and pellet (P) lanes for GSDMD-(N+C), GSDMD-FL, GSDMA3-(N+C) and GSDMA3-FL; b, liposomes of 80% PC + 20% membrane lipids (PE, PI or CL) for GSDMD-(N+C), GSDMD-FL, GSDMA3-(N+C), GSDMA3-FL, GSDMA-(N+C), GSDMA-N and GSDMA-FL; c, time-lapse images (h:min:s); d, iBMDM + extracellular addition of protein; e, iBMDM + protein electroporation. FL, full-length; N, gasdermin-N domain; C, gasdermin-C domain; N+C, gasdermin-N/C noncovalent complex.]

Figure 1 | Lipid binding, biomembrane association and disruption by the gasdermin-N domain. a, b, Binding of
the gasdermin-N domain to membrane lipids. Purified gasdermin proteins were incubated with liposomes with indicated
lipid compositions. After ultracentrifugation, the liposome-free supernatant (S) and the liposome pellet (P) were analysed by
SDS-PAGE and Coomassie blue staining. CL, cardiolipin; PC, phosphatidylcholine; PE, phosphatidylethanolamine; PI,
phosphatidylinositol. The gasdermin-N and -C noncovalent complex was obtained from inter-domain cleavage of the full-length (FL) protein.
c, Microscopy imaging of GSDMD-N localization in cells during pyroptosis. GSDMD-N(L192D)-eGFP was stably expressed in HeLa cells under
a tetracycline-inducible promoter. Shown are representative time-lapse cell images (brightfield and fluorescence) taken from 4-5 h
after doxycycline addition. Scale bar, 15 μm. For videos of two representative cells, see Supplementary Videos 1 and 2. d, e, Effects of
extracellular or intracellular delivery of purified gasdermin proteins on iBMDM cell viability. Equal amounts of indicated gasdermin proteins
or PFO were added directly into cell culture medium (d) or electroporated into the cytosol (e). ATP-based cell viability is expressed as
mean ± s.d. from three technical replicates. All data shown are representative of three independent experiments.


bisphosphorylated (PtdIns(3,4)P2 and PtdIns(3,5)P2) or triphosphorylated 
(PtdIns(3,4,5)P3) phosphatidylinositols (Extended Data 
Fig. 2b). Similar PtdIns(4,5)P2 binding was observed with liposomes 
made of complicated phospholipid mixtures (phosphatidylcholine, 
phosphatidylethanolamine, phosphatidylserine, phosphatidylinositol 
and PtdIns(4,5)P2) that mimic plasma membrane lipid composition 
(Extended Data Fig. 2c). 

Cardiolipin and phosphatidylethanolamine are major bacterial 
membrane lipids. The gasdermin-N domains of GSDMD, GSDMA 
and GSDMA3, but not the full-length proteins or their gasdermin-C 
domains, were efficiently and specifically precipitated by cardiolipin 
liposomes (Fig. 1b). Reducing cardiolipin concentration from 20% 
to 10% decreased the binding efficiency (Fig. 1b and Extended Data 
Fig. 2d). Specific binding of gasdermin-N to cardiolipin, as well as the 
phosphoinositides, was also evident in the lipid-strip binding assay 
(Extended Data Fig. 2e). Thus, cardiolipin and phosphoinositide are 
two membrane lipid targets of gasdermin-N. 

For the three gasdermins GSDMD, GSDMA and GSDMA3, only 
the gasdermin-N domain of GSDMA could be separated from the 
noncovalent complex by high-salt buffer. The apo-GSDMA-N domain 
showed similar binding to cardiolipin or phosphoinositide liposomes as 


the noncovalent complex (Fig. 1b and Extended Data Fig. 2a, c, d), suggesting 
that lipid binding by GSDMA does not involve gasdermin-C. 
Moreover, the three gasdermin-N domains bound strongly to 
cardiolipin liposomes; GSDMA-N showed weaker binding to phosphoinositide 
liposomes than GSDMD-N or GSDMA3-N despite their 
comparable pyroptosis-inducing activity (Extended Data Fig. 1a, b). A 
possible cause of this phenomenon is that artificial liposomes may not 
exactly recapitulate the complex lipid constituents of biomembranes. 


Membrane targeting of gasdermin-N during pyroptosis 

We next examined the localization of gasdermin-N during pyroptosis. 
Extracts of Gsdmd−/− immortalized bone marrow macrophages 
(iBMDMs) stably expressing Flag-GSDMD¹⁰ were sequentially centrifuged 
(700g, 20,000g and 100,000g). Full-length GSDMD was found 
exclusively in the supernatant after 100,000g centrifugation (S100) 
(Extended Data Fig. 3a), suggesting a cytosolic localization. When 
pyroptosis was induced by LPS electroporation, GSDMD was cleaved 
into GSDMD-N and GSDMD-C; while GSDMD-C remained in S100, a 
large portion of GSDMD-N was distributed in the P7 heavy-membrane 
(pellet from 700g centrifugation) and P20 light-membrane fractions, 
which resembles the distribution of LAMP1 (endosome/lysosome), 




[Figure panels: a, PI(4,5)P2 liposomes (45% PC + 35% PE + 5% PS + 5% PI + 10% PI(4,5)P2); b, cardiolipin liposomes (80% PC + 20% cardiolipin); c, liposomes of 50% cholesterol + 50% PC. Panels a-c plot release against time (min) for the indicated (N+C), FL, C and N forms of GSDMD, GSDMA3 and GSDMA, with PFO and control (CTL). d, Dextran release (%) from cardiolipin liposomes (80% PC + 20% cardiolipin) for 3 kDa, 10 kDa and 40 kDa dextrans.]

Figure 2 | Liposome-leakage-inducing activity of the gasdermin-N 
domain. Liposomes with indicated lipid compositions were treated with 
purified gasdermin proteins or PFO as indicated. a-c, Liposome leakage 
was monitored by measuring 2,6-pyridinedicarboxylic acid (DPA) 
chelating-induced fluorescence of released Tb³⁺. Time course of relative 
Tb³⁺ release is shown. CTL, control. d, Different-size fluorescent dextrans 
were encapsulated into the liposome and the released dextran fluorescence 
was determined. Triton X-100 treatment was used to achieve 100% 
liposome leakage. Final leakage of the dextrans is expressed as mean ± s.d. 
from three technical replicates. All data shown are representative of three 
independent experiments. 


Na,K-ATPase α1 (plasma membrane) and syntaxin 6 (trans-Golgi) 
(Extended Data Fig. 3a). GSDMD-N was also found in the lightest 
P100 fraction. As expected, actin and mitochondrial proteins (COX4 
and TOM20) were found in the S100 and P7 fractions, respectively. The 
gasdermin-N domain of PPase-cleavable GSDMA showed the same 
distribution in pyroptotic 293T cells (Extended Data Fig. 3b). Thus, 
certain gasdermin-N domains moved to heterogeneous membranes 
during pyroptosis, echoing the binding to various phosphoinositides. 

To visualize the membrane targeting, GSDMD-N and GSDMA3-N 
fused to enhanced green fluorescent protein (eGFP) were inducibly 
expressed in HeLa cells. The pyroptosis triggered by gasdermin-N 
expression was too severe and rapid to allow accumulation of fluorescence 
signals sufficient for detection. This was overcome by a 
mutation of leucine 192 to aspartate (L192D), which slowed down 
the pyroptosis (see below). GSDMD-N(L192D) showed initial cytoplasmic 
distribution, and some of it was translocated and accumulated 
on the plasma membrane as pyroptosis progressed; the cells 
then developed characteristic swelling bubbles and became ruptured 
(Fig. 1c and Supplementary Videos 1, 2). Similar results were 
obtained with GSDMA3-N containing an equivalent L184D mutation 
(Extended Data Fig. 3c and Supplementary Videos 3, 4). 


Biomembrane disruption correlates with lipid binding 

Perfringolysin O (PFO) is a cholesterol-targeting pore-forming toxin 
from Clostridium perfringens. Extracellular addition of purified PFO 
to iBMDMs caused severe cytotoxicity (Fig. 1d). By contrast, the noncovalent 
complex of cleaved GSDMD or GSDMA3 or the GSDMA-N 
domain induced cell lysis only when delivered cytosolically but not 
when administered extracellularly (Fig. 1d, e). Identical results were 



obtained in 293T cells (Extended Data Fig. 3d, e). These findings are 
consistent with the presence of cholesterol in the exoplasmic leaflet of 
the plasma membrane and the localization of phosphoinositides, the 
targets of gasdermin-N, only in the cytoplasmic leaflet. To perform a 
similar assay in bacteria, we used protoplasts of Gram-positive Bacillus 
megaterium containing a single cardiolipin-containing membrane. The 
protoplasts were completely lysed by the GSDMD-(N+C) noncovalent 
complex, but not by full-length GSDMD or GSDMD-C (Extended 
Data Fig. 3f, g). PFO also caused no protoplast lysis, consistent with 
the absence of cholesterol in bacterial membranes. Similar results were 
obtained with the gasdermin-N domains of GSDMA and GSDMA3 
(Extended Data Fig. 3g). Thus, membrane disruption by gasdermin-N 
correlates well with its lipid-binding properties. 


Liposome leakage triggered by the gasdermin-N domain 
We then tested the induction of liposome leakage by the gasdermin-N 
domain. The noncovalent complex of cleaved GSDMD or GSDMA3 
caused around 50% leakage of PtdIns(4,5)P2 liposomes (Extended Data 
Fig. 4a). The leakage reached 100% when PtdIns(4,5)P2 was reconstituted 
into liposomes containing complicated phospholipid mixtures 
(Fig. 2a). Consistent with the binding data, GSDMA-N induced 
less liposome leakage (Fig. 2a and Extended Data Fig. 4a); liposome 
leakage reached about 50% with higher concentrations of GSDMA-N 
(Extended Data Fig. 4b). All three gasdermins caused nearly 100% leakage 
of the cardiolipin liposome (Fig. 2b). Full-length gasdermins and 
gasdermin-C did not lyse either type of liposome (Fig. 2a, b and Extended 
Data Fig. 4a). The gasdermins had no effect on cholesterol- or 
phosphatidylethanolamine-reconstituted liposomes (Fig. 2c and Extended 
Data Fig. 4c). As expected, PFO lysed cholesterol-containing liposomes 
but not those containing PtdIns(4,5)P2 or cardiolipin (Fig. 2a-c 
and Extended Data Fig. 4a). These results are consistent with the finding 
that PFO but not gasdermin-N could lyse mammalian cells from the outside 
(Fig. 1d and Extended Data Fig. 3d). When liposomes encapsulating 
different-size fluorescent dextrans were assayed, the active forms of 
GSDMD, GSDMA3 and GSDMA could release dextrans with molecular 
masses of 3 or 10 kDa but not 40 kDa (Fig. 2d). This indicates that 
items with diameters of 10 nm or less can pass through the presumed 
pores formed by gasdermin-N. 
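
The leakage percentages quoted in this section are relative values, with Triton X-100 treatment defining 100% leakage (see the Fig. 2 legend). A minimal sketch of that kind of normalization is given here. The function names, the subtraction of an untreated baseline and the clamping to the 0-100% range are assumptions made for illustration; the exact calculation used by the authors is not described in this section.

```python
def percent_leakage(f_sample, f_baseline, f_triton):
    """Express a fluorescence reading as a percentage of full lysis.

    f_baseline: reading from untreated liposomes (ASSUMED to define 0%).
    f_triton:   reading after Triton X-100 treatment (defines 100%).
    """
    span = f_triton - f_baseline
    if span <= 0:
        raise ValueError("Triton reading must exceed the baseline reading")
    pct = 100.0 * (f_sample - f_baseline) / span
    # ASSUMPTION: clamp measurement noise outside the physical 0-100% range.
    return max(0.0, min(100.0, pct))


def leakage_time_course(readings, f_baseline, f_triton):
    """Normalize a list of readings taken over time."""
    return [percent_leakage(f, f_baseline, f_triton) for f in readings]


if __name__ == "__main__":
    # Invented readings in arbitrary fluorescence units, for illustration only.
    print(leakage_time_course([10.0, 35.0, 60.0], f_baseline=10.0, f_triton=110.0))
```

Normalizing every trace to its own Triton X-100 endpoint is what allows liposomes of different composition to be compared on one percentage scale.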


Oligomerized gasdermin-N forms membrane pores 
Full-length GSDMD, GSDMA3 and GSDMA were monomeric in 
solution (Extended Data Fig. 5a). Upon crosslinking of the GSDMD- 
or GSDMA3-(N+C) noncovalent complex or the GSDMA-N domain, 
GSDMD-N remained monomeric and GSDMA3/GSDMA-N showed 
a low degree of artificial oligomerization (Extended Data Fig. 5b). 
When crosslinking was performed after liposome incubation, all three 
gasdermin-N domains appeared as high-order oligomers (Extended 
Data Fig. 5b). As a control, PFO was converted from monomers 
into SDS-resistant oligomers after liposome incubation’. In LPS- 
stimulated pyroptotic iBMDMs, membrane-associated GSDMD-N, 
resulting from caspase-11 cleavage, also formed high-order oligomers, 
whereas full-length GSDMD and cytosolic GSDMD-N remained 
monomeric (Extended Data Fig. 5c). Similar results were obtained 
with the PPase-cleavable GSDMA in pyroptotic 293T cells (Extended 
Data Fig. 5c). 

Negative-stain electron microscopy revealed multiple pores on 
nearly all cardiolipin liposomes that had been incubated with the 
GSDMD- or GSDMA3-(N+C) noncovalent complex, but not with 
full-length GSDMD or GSDMA3 (Fig. 3a). Similar (but fewer) mem- 
brane pores were formed on PtdIns(4,5)P2-containing liposomes 
(Extended Data Fig. 5d); both intact and severely fragmented liposomes 
caused by merging of adjacent pores were observed. The pore-forming 
efficiency was markedly improved by reconstituting PtdIns(4,5)P2 
into liposomes containing the complicated phospholipid mixtures 
(Fig. 3a), consistent with the finding that these liposomes showed 
more severe leakage (Fig. 2a and Extended Data Fig. 4a). Furthermore, 




[Figure panels: a, liposomes of 80% PC + 20% cardiolipin and of 45% PC + 35% PE + 5% PS + 5% PI + 10% PI(4,5)P2, treated with GSDMD-FL, GSDMD-(N+C), GSDMA3-(N+C) or GSDMA3-FL; b, liposomes made of liver polar lipid extracts, treated with the N+C or full-length forms of GSDMD and GSDMA3.]

Figure 3 | Membrane pore-forming activity 
of the gasdermin-N domain. a, b, Liposomes 
with indicated lipid compositions (a) or 
prepared using bovine liver-derived polar 
lipid extracts (b) were treated with indicated 
gasdermin proteins. Shown are representative 
negative-stain electron microscopy 
micrographs of the liposomes (scale bar, 
100 nm). Insets in a, expanded view of a 
representative pore (scale bar, 15 nm). All data 
shown are representative of three independent 
experiments. 


the gasdermin-N domains of GSDMD, GSDMA3 and GSDMA could 
bind robustly to liposomes made of bovine liver or brain-derived lipid 
extracts (Extended Data Fig. 2f) and accordingly generated similar 
pores on the liposomes (Fig. 3b). 

Of the GSDMD-induced pores, 60% had inner diameters of 
10-16 nm whereas all GSDMA3-induced pores had inner diameters 
of 10-14 nm (Extended Data Fig. 6a, b). Decreasing the gasdermin 
concentration by 10-fold affected the number but not the size distribution 
of the pores (Extended Data Fig. 6a, b). The pore size is 
consistent with the assessment from the dextran leakage data (Fig. 2d). 
The wider pore-size range produced by GSDMD, compared with 
GSDMA3, probably resulted from the less optimal properties of 
recombinant GSDMD. Previously, the inner diameter of membrane 
pores in caspase-1-mediated pyroptosis was estimated to be 
1.1-2.4 nm, on the basis of the observation that PEG2000 but not PEG200, 
at an equal molar concentration, can block pyroptosis²⁰. We confirmed 
this finding but further found that increasing the mass concentration 
of PEG200 to the same as PEG2000 (to ensure equal osmotic potential) 
could generate the same protective effect (Extended Data Fig. 6c). 
Considering that gasdermin-N formed similar pores on liposomes 
made of natural lipid extracts, pores with inner diameters of 10-14 nm 
are likely to predominate in vivo. This pore size can allow the passage 
of mature IL-1β (also IL-18) and caspase-1, which have diameters of 
4.5 and 7.5 nm, respectively. GSDMD-N and GSDMA3-N also formed 
pores on lipid monolayers; GSDMA3-N pores had a uniform inner 
diameter of about 14 nm (Extended Data Fig. 6d). Following 2D classification, 
one class of GSDMA3 pores showed the best protein contrast, 
and subsequent rotational auto-correlation analysis revealed 16-fold 
symmetry (Extended Data Fig. 6e). Given the single-layer property 
of all known pore-forming complexes, these data suggest that the 
gasdermin-N domain forms a 16-mer pore complex. 
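
The size argument in this paragraph is simple arithmetic, and a short sketch makes it explicit. It compares the quoted cargo diameters with the quoted pore inner diameters and computes how much of the inner circumference each protomer would line in a 16-fold symmetric ring. Treating each cargo as a rigid sphere and the pore lumen as a perfect circle are simplifying assumptions of this sketch, and the function and variable names are hypothetical.

```python
import math

# Diameters in nm, as quoted in the text.
CARGO_NM = {"mature IL-1beta": 4.5, "caspase-1": 7.5}
PORE_INNER_RANGE_NM = (10.0, 14.0)
EARLIER_ESTIMATE_RANGE_NM = (1.1, 2.4)


def fits_through(pore_inner_nm, cargo_nm):
    """ASSUMPTION: a rigid spherical cargo passes if it is narrower than the pore."""
    return cargo_nm < pore_inner_nm


def arc_per_protomer_nm(inner_diameter_nm, n_protomers=16):
    """Inner circumference of a circular pore divided equally among protomers."""
    return math.pi * inner_diameter_nm / n_protomers


if __name__ == "__main__":
    smallest_pore = PORE_INNER_RANGE_NM[0]
    for name, size in CARGO_NM.items():
        print(name, "fits smallest pore:", fits_through(smallest_pore, size))
    for d in PORE_INNER_RANGE_NM:
        print(f"{d} nm pore: {arc_per_protomer_nm(d):.2f} nm of inner rim per protomer")
```

On these numbers both cargos clear even the narrowest 10 nm pore, whereas neither would pass a pore at the upper end of the earlier 1.1-2.4 nm estimate.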


Crystal structure of GSDMA3 

We determined the 1.90 Å crystal structure of GSDMA3 (Extended 
Data Table 1 and Extended Data Fig. 7a). The structure is separated 
into two domains, the presumed gasdermin-N and gasdermin-C 
domains (Fig. 4a). Gasdermin-N mainly contains an extended 




twisted β-sheet formed by nine tandem strands (β3-β11) (Fig. 4a 
and Extended Data Fig. 7b). The N-terminal α1 helix and following 
β1-β2 hairpin lie in the concave of the β-sheet. Helices α2 and 
α3 flank the β-sheet at one end. Helix α4 protrudes away from the 
other end of the β-sheet through two loops to contact the helical 
gasdermin-C domain. The C domain adopts a compact globular fold 
covered by a short three-stranded β-sheet (β12-β14). The long loop 
(residues 234-263) linking the GSDMA3-N and -C domains stretches 
away from the main body of the structure. A structural homology 
search²¹ revealed no meaningful information about gasdermin-C; 
the structure of gasdermin-N also showed no convincing similarity 
to any known proteins, suggesting that it represents a new type of 
pore-forming protein. 


Functional analyses of GSDMA3 autoinhibition 

In the structure, the α1 helix and β1-β2 hairpin in GSDMA3-N 
provide the primary surface for binding to the GSDMA3-C domain 
(Fig. 4b). F48 and W49 from the hairpin loop are inserted deeply 
into a groove in GSDMA3-C and encircled by L270, Y344, A348 and 
A443, forming a hydrophobic core (Fig. 4c). In addition, R43, K44 
and T46 from the hairpin have four hydrogen bonds with E273, E277 
and D340; α1 also supplies Dé and R13 for hydrogen-bonding with 
H436 and D433. At the second inter-domain interface, α4 presents 
its hydrophobic face (L181, L184 and L186) to a non-polar surface 
formed by α9 and α11 in GSDMA3-C (Fig. 4c). Nine Gsdma3 mutations 
that can cause alopecia and hyperkeratosis in mice have been 
identified¹⁴⁻¹⁶. Among them, 259RDW (insertion after residue 259 
with three mistranslated residues RDW) and 366stop (a premature 
stop at residue 366) encode truncated GSDMA3 devoid of inter-domain 
contacts; Y344C, Y344H and A348T are mutations in residues 
that directly contact GSDMA3-N; T278P and L343P are near the 
direct-contacting residues; and 412EA (duplication of E411A412) 
disrupts the α4-binding surface. Consistently, T278P, L343P, Y344C, 
A348T and 412EA all weaken inter-domain interactions, resulting 
in constitutive activation of GSDMA3 (ref. 10). Similarly, GSDMA3 
L270D, Y344D and A348D exhibited spontaneous pyroptosis-inducing 
activity (Fig. 4d). 




[Figure panel d: 293T cells expressing 3×Flag-GSDMA3, 3×Flag-GSDMD, GSDMA-Flag, GSDMC-Flag or DFNA5-Flag.]

Figure 4 | Crystal structure of GSDMA3 and structural autoinhibition 
of the gasdermin family. a, b, Overall structure of GSDMA3 and the inter-domain 
interfaces. The gasdermin-N (GSDMA3-N) and gasdermin-C 
(GSDMA3-C) domains are coloured green and yellow, respectively. 
Structures of GSDMA3-N (a, b) and GSDMA3-C (a) are shown as cartoon 
models, and that of GSDMA3-C (b) is in a transparent surface scheme. 
Secondary structure elements are labelled in a. The primary and second 
inter-domain interfaces are highlighted by large and small blue ellipses, 
respectively (b). Disordered loops are indicated by dashed lines. 
c, Close-up view of the autoinhibitory interactions. Left and right, primary 
and secondary inter-domain interfaces, respectively. Residues involved 
in the autoinhibitory interactions are labelled and shown as sticks. Point 
mutations in GSDMA3 that cause alopecia in mice are coloured pink. 
Dotted lines, hydrogen bonds. d, Mutation analyses of the autoinhibitory 
contacts. Full-length GSDMA3, GSDMD, GSDMA, GSDMC or DFNA5 
(wild type (WT) or containing indicated point mutations in their 
gasdermin-C domains) was transfected into 293T cells. ATP-based cell 
viability is expressed as mean ± s.d. from three technical replicates; data 
are representative of three independent experiments. 


Conserved autoinhibition in the gasdermin family 

GSDMD shares about 70% homology with GSDMA3. Homology-based 
modelling produced a highly analogous GSDMD structure (Extended 
Data Fig. 7c). The structure contains a hydrophobic core resembling 
that in GSDMA3, in which F49/W50 play equivalent roles to F48/W49 
in GSDMA3. The modelled structure bears a similar second inter-domain 
contact; an α4-equivalent helix has extensive hydrophobic 
interactions with GSDMD-C despite the residues involved being 
different. The residues in gasdermin-C that make the hydrophobic 
core are highly conserved in the gasdermin family (Extended Data 
Fig. 8), including the above-analysed L270, Y344 and A348 in GSDMA3. 
When equivalent residues in GSDMD (L290, Y373 and A377), GSDMA 
(L260, Y334 and A338), GSDMC (L319, Y398 and A402) and DFNA5 




[Figure panels: a, 293T cell transfection, with anti-Flag immunoblot of GSDMD and GSDMD-N; b, supernatant (S) and pellet (P) lanes for PI, PI(4,5)P2, PE and CL liposomes incubated with GSDMD WT-(N+C), L192D-(N+C) or E15K/L192D-(N+C); c, PI(4,5)P2 liposome and cardiolipin liposome leakage against time (min) for WT-(N+C), WT-FL, L192D-(N+C) and E15K/L192D-(N+C); d, cardiolipin liposomes.]

Figure 5 | Residues in the autoinhibited region in gasdermin-N are 
important for pyroptosis, membrane disruption and pore formation. 
a, Effects of L192D/E15K mutations in GSDMD-N on pyroptosis-inducing 
activity. Full-length GSDMD or GSDMD(1-275) (wild type or with 
indicated mutations) was transfected into 293T cells. ATP-based cell viability 
is expressed as mean ± s.d. from three technical replicates. The immunoblot 
shows expression of transfected GSDMD. b, c, Effects of L192D/E15K 
mutations on GSDMD-N lipid-binding and liposome-leakage-inducing 
activities. Purified GSDMD proteins were incubated with liposomes 
containing 80% phosphatidylcholine and 20% phosphatidylinositol, 
PtdIns(4,5)P2, phosphatidylethanolamine or cardiolipin. After 
ultracentrifugation, the liposome-free supernatant (S) and the liposome 
pellet (P) were analysed by SDS-PAGE (b). Liposome leakage was 
monitored by measuring DPA chelating-induced fluorescence of released 
Tb³⁺ relative to that of Triton X-100 treatment (c). d, Effects of L192D/E15K 
mutations on pore formation by GSDMD-N. Shown are representative 
negative-stain electron microscopy micrographs of pores formed by 
indicated GSDMD proteins on cardiolipin liposomes (scale bar, 100 nm). All 
data shown are representative of three independent experiments. 


(I313, F388 and A392) were individually mutated into aspartates, 
20-80% pyroptosis occurred in 293T cells expressing the mutant 
proteins (except for DFNA5 I313D) (Fig. 4d). Thus, structural 
autoinhibition is conserved in most gasdermins. 


Structure-based analyses of gasdermin-N function 

L184 in GSDMA3-N (L192 in GSDMD-N) on the α4 helix is contacted 
by the inhibitory gasdermin-C domain (Fig. 4c and Extended Data 
Fig. 7c). E14 in GSDMA3-N (E15 in GSDMD-N) on helix α1 is within 
the primary inter-domain interface (Fig. 4c). Mutations of gasdermin-C 
residues that contact L184 or structural regions around E14 caused 
constitutive activation of GSDMA3 and GSDMD (ref. 10) (Fig. 4d). As these 
mutations will disrupt the autoinhibition and expose L184/E14, we 
reasoned that L184/E14 or their adjacent residues might be important 
for pyroptosis. Supporting this prediction, GSDMD-N(L192D) 
and GSDMA3-N(L184D) induced markedly decreased pyroptosis in 
293T cells (Fig. 5a and Extended Data Fig. 9a). The mutants showed 
evident defects in binding the liposome and causing liposome 
leakage (Fig. 5b, c and Extended Data Fig. 9b, c). Combining the L/D 
mutation with the also partially inactive E15K mutation on GSDMD 
(E14K on GSDMA3) led to further decreased liposome binding and 
leakage-inducing activities. Double mutants of the GSDMD- and 
GSDMA3-(N+C) complexes formed fewer pores than the L/D single 
mutant (Fig. 5d and Extended Data Fig. 9d). GSDMD-N(L192D/E15K) 
and GSDMA3-N(L184D/E14K) were largely deficient in pyroptosis 
induction (Fig. 5a and Extended Data Fig. 9a). The two residues probably 
participate in oligomerization and membrane insertion during 
pore formation. These data reinforce the idea that the liposome-leakage 
and pore-forming activities of gasdermin-N are responsible for cell 
pyroptosis. 


Discussion 

We show that multiple gasdermin-N domains can induce pyroptosis 
owing to their pore-forming activity. Most gasdermin pores have inner 
diameters of 10-14 nm. The structure of GSDMA3 uncovers an autoinhibitory 
mechanism that is conserved in the gasdermin family. Other 
members of the gasdermin family may also act on endomembranes 
and alter other cellular physiologies. Indeed, DFNB59 and DFNA5 
have been suggested to localize on the peroxisomes and mitochondria, 
respectively (refs 22, 23). 

The gasdermin-N domain represents a new type of pore-forming 
protein (PFP). PFPs are diverse in sequence and present in many 
domains of life (ref. 24). Most PFPs lyse cell membranes from the outside. 
By contrast, gasdermins and the MLKL protein (which is involved in 
necroptosis) kill cells from the cytosol. MLKL can induce liposome 
leakage, but there is no reported evidence that it can form pores (refs 25-28). 
PFPs can be divided into α-helical and β-barrel classes on the basis 
of the structures of their membrane-spanning regions (ref. 29). The structure 
of gasdermin-N differs completely from those of α-class PFPs. A 
Dali search of the GSDMA3-N structure produced hits all belonging 
to the membrane-attack complex/perforin/cholesterol-dependent 
cytolysin (MACPF/CDC) family of β-class PFPs (ref. 30), but the score was 
not confident and no meaningful structural similarity could be identified 
(Extended Data Fig. 7d). Gasdermin-N is likely to use either a 
β-barrel-like or a distinct mechanism for pore formation, which 
will involve drastic conformational changes for insertion into the 
membrane. Our results pave the way for future studies to elucidate 
the structural mechanism of pore formation by gasdermins. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 6 March 2015; accepted 18 May 2016. 
Published online 8 June 2016. 


1. Jorgensen, I. & Miao, E. A. Pyroptotic cell death defends against intracellular 
pathogens. Immunol. Rev. 265, 130-142 (2015). 

2. Cookson, B. T. & Brennan, M. A. Pro-inflammatory programmed cell death. 
Trends Microbiol. 9, 113-114 (2001). 

3. Shi, J. et al. Inflammatory caspases are innate immune receptors for 
intracellular LPS. Nature 514, 187-192 (2014). 

4. Kayagaki, N. et al. Non-canonical inflammasome activation targets caspase-11. 
Nature 479, 117-121 (2011). 

5. Lamkanfi, M. & Dixit, V. M. Mechanisms and functions of inflammasomes. Cell 
157, 1013-1022 (2014). 

6. Zhao, Y. & Shao, F. Diverse mechanisms for inflammasome sensing of cytosolic 
bacteria and bacterial virulence. Curr. Opin. Microbiol. 29, 37-42 (2016). 

7. Hagar, J. A., Powell, D. A., Aachoui, Y., Ernst, R. K. & Miao, E. A. Cytoplasmic LPS 
activates caspase-11: implications in TLR4-independent endotoxic shock. 
Science 341, 1250-1253 (2013). 

8. Kayagaki, N. et al. Noncanonical inflammasome activation by intracellular LPS 
independent of TLR4. Science 341, 1246-1249 (2013). 

9. Yang, J., Zhao, Y. & Shao, F. Non-canonical activation of inflammatory caspases 
by cytosolic LPS in innate immunity. Curr. Opin. Immunol. 32, 78-83 (2015). 

10. Shi, J. et al. Cleavage of GSDMD by inflammatory caspases determines 
pyroptotic cell death. Nature 526, 660-665 (2015). 

11. Kayagaki, N. et al. Caspase-11 cleaves gasdermin D for non-canonical 
inflammasome signalling. Nature 526, 666-671 (2015). 


12. Saeki, N. & Sasaki, H. in Endothelium and Epithelium: Composition, Functions, 
and Pathology (eds Carrasco, J. & Matheus, M.) Ch. IX, 193-211 (Nova Science 
Publishers, 2011). 

13. Tanaka, S., Mizushina, Y., Kato, Y., Tamura, M. & Shiroishi, T. Functional 
conservation of Gsdma cluster genes specifically duplicated in the mouse 
genome. G3 (Bethesda) 3, 1843-1850 (2013). 

14. Sato, H. et al. A new mutation Rim3 resembling Re(den) is mapped close to 
retinoic acid receptor alpha (Rara) gene on mouse chromosome 11. 
Mamm. Genome 9, 20-25 (1998). 

15. Porter, R. M. et al. Defolliculated (dfl): a dominant mouse mutation leading to 
poor sebaceous gland differentiation and total elimination of pelage follicles. 
J. Invest. Dermatol. 119, 32-37 (2002). 

16. Runkel, F. et al. The dominant alopecia phenotypes Bareskin, Rex-denuded, 
and Reduced Coat 2 are caused by mutations in gasdermin 3. Genomics 84, 
824-835 (2004). 

17. Van Laer, L. et al. Nonsyndromic hearing impairment is associated with a 
mutation in DFNA5. Nat. Genet. 20, 194-197 (1998). 

18. Delmaghani, S. et al. Mutations in the gene encoding pejvakin, a newly 
identified protein of the afferent auditory pathway, cause DFNB59 auditory 
neuropathy. Nat. Genet. 38, 770-778 (2006). 

19. Shepard, L. A., Shatursky, O., Johnson, A. E. & Tweten, R. K. The mechanism of 
pore assembly for a cholesterol-dependent cytolysin: formation of a large 
prepore complex precedes the insertion of the transmembrane beta-hairpins. 
Biochemistry 39, 10284-10293 (2000). 

20. Fink, S. L., Bergsbaken, T. & Cookson, B. T. Anthrax lethal toxin and Salmonella 
elicit the common cell death pathway of caspase-1-dependent pyroptosis via 
distinct mechanisms. Proc. Natl Acad. Sci. USA 105, 4312-4317 (2008). 

21. Holm, L. & Rosenstrém, P. Dali server: conservation mapping in 3D. Nucleic 
Acids Res. 38, W545-W549 (2010). 

22. Delmaghani, S. et al. Hypervulnerability to sound exposure through impaired 
adaptive proliferation of peroxisomes. Cell 163, 894-906 (2015). 

23. Van Rossom, S., Op de Beeck, K., Hristovska, V., Winderickx, J. & Van Camp, G. 
The deafness gene DFNA5 induces programmed cell death through 
mitochondria and MAPK-related pathways. Front. Cell. Neurosci. 9, 231 
(2015). 

24. Bischofberger, M., Iacovache, I. & van der Goot, F. G. Pathogenic pore-forming 
proteins: function and host response. Cell Host Microbe 12, 
266-275 (2012). 

25. Hildebrand, J. M. et al. Activation of the pseudokinase MLKL unleashes the 
four-helix bundle domain to induce membrane localization and necroptotic 
cell death. Proc. Natl Acad. Sci. USA 111, 15072-15077 (2014). 

26. Dondelinger, Y. et al. MLKL compromises plasma membrane integrity by 
binding to phosphatidylinositol phosphates. Cell Reports 7, 971-981 (2014). 

27. Wang, H. et al. Mixed lineage kinase domain-like protein MLKL causes necrotic 
membrane disruption upon phosphorylation by RIP3. Mol. Cell 54, 133-146 
(2014). 

28. Cai, Z. et al. Plasma membrane translocation of trimerized MLKL protein is 
required for TNF-induced necroptosis. Nat. Cell Biol. 16, 55-65 (2014). 

29. Iacovache, I., Bischofberger, M. & van der Goot, F. G. Structure and assembly of 
pore-forming proteins. Curr. Opin. Struct. Biol. 20, 241-246 (2010). 

30. Reboul, C. F., Whisstock, J. C. & Dunstone, M. A. Giant MACPF/CDC pore forming 
toxins: A class of their own. Biochim. Biophys. Acta 1858, 475-486 (2016). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank W. Wei for reagents, H. Wang for suggestions 
on electron microscopy data analysis, and the staff of beamlines BL18U1 
and BL19U1 at National Center for Protein Sciences, Shanghai, and 
Shanghai Synchrotron Radiation Facility for X-ray data collection. This work 
was supported by grants from the Strategic Priority Research Program of 
the Chinese Academy of Sciences (XDB08020202), the China National 
Science Foundation Program for Distinguished Young Scholars (31225002) 
and Program for International Collaborations (31461143006), and the 
National Basic Research Program of China 973 Program (2012CB518700 
and 2014CB849602) to F.S. The research was also supported in part by an 
International Early Career Scientist grant from the Howard Hughes Medical 
Institute and the Beijing Scholar Program to F.S. 


Author Contributions J.D., D.-C.W. and F.S. conceived the study; J.D., together 
with K.W., designed and performed the majority of the experiments; 
Y.S. helped with protein purification; W.L. performed the pyroptosis assay; 
Q.S. assisted J.D. in electron microscopy studies; J.S. provided critical 
reagents and suggestions; H.S. performed structural modelling; and J.D. and 
F.S. analysed the data and wrote the manuscript. All authors discussed the 
results and commented on the manuscript. 


Author Information The atomic coordinates and structure factors of GSDMA3 
have been deposited in the Protein Data Bank under the accession code 
5B5R. Reprints and permissions information is available at www.nature.com/ 
reprints. The authors declare no competing financial interests. Readers are 
welcome to comment on the online version of the paper. Correspondence and 
requests for materials should be addressed to D.-C.W. (dcwang@sun5.ibp.ac.cn) 
or F.S. (shaofeng@nibs.ac.cn). 


Reviewer Information Nature thanks E. Miao, F. Sigworth and the other 
anonymous reviewer(s) for their contribution to the peer review of this work. 


METHODS 


The experiments were not randomized. The investigators were not blinded to allo- 
cation during experiments and outcome assessment. 

Plasmids, antibodies and reagents. Complementary DNA (cDNA) for human 
GSDMA, GSDMB, GSDMC, GSDMD and mouse Gsdma3 were previously 
described (ref. 10); cDNA for human DFNA5 was obtained from the Life Technologies 
Ultimate ORF collection (IOH41276). The cDNAs were inserted into a modified 
pCS2 vector with an N-terminal 3×Flag tag or the pcDNA3 vector with a 
C-terminal Flag tag for transient expression in 293T cells and the pWPI lentiviral 
vector with an N-terminal 2×Flag-HA tag for stable expression in iBMDM cells. 
For Tet-On expression, the cDNAs were inserted into a modified pLenti-NIrD 
vector (a gift from W. Wei) harbouring the blasticidin-resistance gene and fused 
with a C-terminal eGFP tag. For growth inhibition in E. coli, the cDNAs were 
cloned into the pET21a vector. For recombinant expression in E. coli, the cDNAs 
were cloned into a modified pET vector with an N-terminal 6×His-SUMO tag 
or pGEX-6P-2 with an N-terminal GST tag. DNA for PFO was amplified from 
genomic DNA of C. perfringens. For recombinant expression of PFO in E. coli, the 
DNA was cloned into a pET28a vector with an N-terminal 6×His tag. Truncation 
mutants of gasdermins were constructed by the standard PCR cloning strategy 
and inserted into the corresponding vectors with indicated tags. Point mutations 
were generated by the QuickChange Site-Directed Mutagenesis Kit (Stratagene). 
All plasmids were verified by DNA sequencing. 

Antibodies used in this study include anti-Flag M2 (F4049), anti-actin (A2066) 
and anti-tubulin (T5168) (Sigma-Aldrich); anti-TOM20 (sc-11415) and anti-Lamp2 
(sc-18822) (Santa Cruz Biotechnology); anti-Cox4 (11967, Cell Signaling 
Technology); anti-LAMP1 (553792) and anti-syntaxin 6 (610635) (BD Pharmingen); 
and anti-Na+/K+-ATPase α1 (2047-1, Epitomics). Rabbit antiserum for the 
GSDMD-C domain was generated in our in-house facility. Natural and synthetic 
lipid products used for liposome preparation were purchased from Avanti Polar 
Lipids Inc. Lipid strips used in the protein-lipid overlay assay were obtained from 
Echelon Biosciences Inc. Fluorescein-labelled dextran was from Life Technologies. 
LPS (L4524), terbium chloride (TbCl3) and DPA were purchased from Sigma-Aldrich. 
Cell culture products were from Life Technologies and all other chemicals 
used were from Sigma-Aldrich unless noted. 

Cell culture and transfection. HeLa and 293T cells were obtained from the 
American Type Culture Collection (ATCC). C57BL/6 mouse-derived wild-type 
and Gsdmd−/− iBMDM cells were as described (ref. 10). The cells are frequently checked 
by virtue of their morphological features and functionalities but have not been 
subjected to authentication by short tandem repeat (STR) profiling. All cell lines 
have been tested to be mycoplasma-negative by the commonly used PCR method. 
Cells were grown in DMEM supplemented with 10% (v/v) fetal bovine serum (FBS) 
and 2 mM L-glutamine at 37°C in a 5% CO2 incubator. Transient transfection in 
HeLa and 293T cells was performed using the Jet-PRIME (Polyplus Transfection) 
or Vigofect (Vigorous) reagents following the manufacturers’ instructions. iBMDM 
or HeLa stable cell lines were generated by lentiviral infection as previously 
described (ref. 10). Stable expression cells were sorted by flow cytometry (BD Biosciences 
FACSAria II) or selected by 60 µg ml−1 blasticidin (Invitrogen). 

Microscopy imaging. To examine morphology of pyroptotic cells, cells were 
treated as indicated in 6-well plates (Nunc Products, Thermo Fisher Scientific Inc.). 
Static brightfield images of pyroptotic cells were captured using an Olympus IX71 
microscope. To examine the subcellular localization of the gasdermin N-domain 
during pyroptosis, HeLa cells harbouring the desired Tet-On expression plasmid 
were treated with 2 µg ml−1 doxycycline in glass-bottom culture dishes (MatTek 
Corporation). After 3 h, live images of pyroptotic cells were recorded with a 
PerkinElmer UltraVIEW spinning disk confocal microscope and processed in 
the Volocity program. All image data shown are representative of at least three 
randomly selected fields. 

Cell viability and osmotic protection. For the viability assay, relevant cells were 
treated as indicated and the viability was determined by the CellTiter-Glo 
Luminescent Cell Viability Assay (Promega). To examine effects of osmotic protection, 
iBMDM cells harbouring a sensitive Nlrp1b allele were treated with 1.2% 
or 12% (w/v) osmoprotectants (PEG200, PEG1500 or PEG2000) or 40 µM zVAD 
for 1 h. Pyroptosis was induced by LFn-BsaK (ref. 31) or anthrax lethal toxin stimulation (ref. 32), 
which activate the NAIP2/NLRC4 inflammasome or NLRP1B inflammasome, 
respectively. Cell death was measured by the LDH release assay using the CytoTox 
96 Non-Radioactive Cytotoxicity Assay kit (Promega). 

Cell fractionation by differential centrifugations. Cells were collected by 
centrifugation at 1,000g for 10 min. The washed cell pellets were resuspended 
in 5 volumes of buffer A (20 mM HEPES, pH 7.5, 40 mM KCl, 1.5 mM MgCl2, 
1 mM EDTA and 250 mM sucrose) and incubated on ice for 30 min. The cells were 
homogenized by passing through a 22G needle 24 times. After centrifugation at 
1,000g for 10 min, the supernatant was collected and re-centrifuged at 7,000g for 
10 min. The supernatant and pellet were designated as the S7 and P7 fraction, 
respectively. The S7 fraction was centrifuged again at 20,000g for 20 min to obtain 
the S20 and P20 fractions. The S20 fraction was subjected to final centrifugation 
at 100,000g for 1 h and the supernatant was collected as the S100 fraction. The 
pellet was dissolved in buffer A as the P100 fraction. All centrifugations were performed 
at 4°C. The fractions were solubilized in SDS loading buffer and analysed 
by immunoblotting as indicated. 

Purification of recombinant proteins. To obtain full-length GSDMD, GSDMA 
and GSDMA3 proteins, E. coli BL21 (DE3) cells harbouring the gasdermin plasmid 
(pET28a-6×His-SUMO vector) were grown in LB medium supplemented with 
30 µg ml−1 kanamycin. Protein expression was induced overnight at 20°C with 
0.4 mM IPTG after OD600 reached 0.8. Cells were lysed in the buffer containing 
20 mM Tris-HCl (pH 8.0), 300 mM NaCl, 20 mM imidazole and 10 mM 
2-mercaptoethanol. The fusion protein was affinity-purified by Ni-Sepharose beads 
(GE Healthcare Life Sciences). The SUMO tag was removed by overnight digestion 
with homemade ULP1 protease at 4°C. The untagged protein was further purified 
by HiTrap Q anion exchange and Superdex G75 gel filtration chromatography (GE 
Healthcare Life Sciences). Selenomethionine-substituted (SeMet) GSDMA3 was 
expressed in the methionine auxotrophic E. coli strain B834 (DE3) and purified in 
the same way as the native protein. 

The engineered gasdermin proteins (GSDMD, GSDMA and GSDMA3) containing 
the PreScission protease (PPase) recognition site were expressed and purified 
by following the same procedure as that for native gasdermin proteins. Inter-domain 
cleavage was performed by overnight digestion with homemade PPase at 
4°C. The proteins were further purified by Superdex G75 gel-filtration chromatography 
to obtain high-quality noncovalent complex of the cleaved gasdermin. 
To obtain the GSDMA-N domain alone, the noncovalent complex of GSDMA was 
further subjected to HiTrap Q anion exchange chromatography, followed by another 
round of Superdex G75 gel filtration chromatography. To obtain the gasdermin-C 
domain of GSDMA3 and GSDMA, the flow-through fractions of PPase-cleaved 
GSDMA3 and GSDMA proteins from Ni-Sepharose beads were subjected to 
HiTrap Q anion exchange and Superdex G75 gel filtration chromatography. To 
obtain GSDMD-C protein, E. coli BL21 (DE3) cells were transformed with pGEX-6P-2-GSDMD 
(residues 276-484). The GST-tagged protein was purified by affinity 
chromatography using glutathione-Sepharose beads (GE Healthcare Life Sciences) 
and the tag was removed by overnight digestion with PPase at 4°C. The proteins 
were further purified by Superdex G75 gel filtration chromatography. Recombinant 
PFO was expressed and purified by following the same procedure as that for the 
gasdermin protein. All the purified proteins were concentrated and stored in the 
buffer containing 20 mM HEPES (pH 7.5), 150 mM NaCl and 5 mM dithiothreitol. 

Bacterial growth inhibition and protoplast lysis. To assay the cytotoxicity of the 
gasdermin-N domain in E. coli, equal amounts of E. coli BL21 (DE3) cells were 
transformed with 0.1 µg of indicated plasmid. The transformed cells were serially 
diluted and plated onto LB agar containing the appropriate antibiotics in the 
presence or absence of IPTG. The colony-forming unit (CFU) was determined by 
counting the number of viable bacteria per transformation after overnight culture 
at 37°C. 

To prepare the protoplast, B. megaterium cells were grown in AB3 medium 
(DIFCO) at 37°C until the OD600 reached 2.0. The centrifuged bacterial pellets, 
resuspended in the buffer containing 20 mM sodium malate (pH 6.5), 20 mM 
MgCl2 and 500 mM sucrose, were incubated with 2 mg ml−1 lysozyme at 37°C 
until protoplast formation was complete (judged under phase-contrast microscopy). 
The protoplasts were diluted to an OD600 of 1.0. To assay protoplast lysis, 
aliquots of the protoplasts (1.9 ml) were incubated with 100 µl of indicated gasdermin 
proteins (final concentration of 0.6 µM) at 37°C for 30 min. To achieve 
complete lysis of the protoplast, 100 µl of 2% (v/v) Triton X-100 was added. The 
OD600 of each protoplast aliquot before and after incubation with the gasdermin 
protein was measured and defined as $A_0$ and $A_n$, respectively, and that after Triton 
X-100 treatment was treated as $A_{100}$. The percentage of protoplast lysis is defined 
as: lysis (%) = $(A_0 - A_n) \times 100 / (A_0 - A_{100})$. For the time-course plot of protoplast 
lysis, the OD600 of each protoplast aliquot was continuously recorded for 15 min 
at 1-min intervals before Triton X-100 was added. 
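
The lysis formula above can be written as a small helper. This is only a sketch of the arithmetic: the function name and the example OD600 readings are hypothetical and are not taken from the study.

```python
def protoplast_lysis_percent(a_before, a_after, a_triton):
    """Percentage lysis from OD600 readings: (A0 - An) * 100 / (A0 - A100).

    a_before: OD600 before adding the gasdermin protein (A0)
    a_after:  OD600 after incubation with the protein (An)
    a_triton: OD600 after Triton X-100 treatment (A100, complete lysis)
    """
    span = a_before - a_triton
    if span <= 0:
        raise ValueError("A0 must be larger than A100 for the ratio to be meaningful")
    return (a_before - a_after) * 100.0 / span


# Hypothetical readings: OD600 drops from 1.0 to 0.75, and to 0.5 after Triton X-100.
example = protoplast_lysis_percent(1.0, 0.75, 0.5)
```

For a time course, the same function can be applied to each 1-min OD600 reading in turn, keeping `a_before` and `a_triton` fixed.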

Liposome preparation. Phospholipids and phosphoinositides were dissolved in 
chloroform and chloroform-methanol mixture (20:9, v/v), respectively. Lipids with 
indicated compositions (0.5 µmol) were mixed in a glass vial. The solvent was 
evaporated under a stream of nitrogen, and the dry lipid film was then hydrated 
at room temperature with constant mixing in 500 µl buffer 1 (20 mM HEPES 
(pH 7.5) and 150 mM NaCl). Liposomes were generated by extrusion of the 
hydrated lipids through a 100-nm polycarbonate filter (Whatman) 35 times using 
the Mini-Extruder device (Avanti Polar Lipids Inc.). For Tb3+-encapsulated liposomes, 
the lipid film was hydrated with 500 µl buffer 2 (20 mM HEPES (pH 7.5), 
100 mM NaCl, 50 mM sodium citrate and 15 mM TbCl3). After the extrusion 
process, Tb3+ ions outside the liposome were removed by repeated washing with 
buffer 2 on a centrifugal filter device (Amicon Ultra-4, 100K MWCO, Millipore). 
The liposomes were subjected to buffer change into buffer 1 for use. To prepare 
dextran-encapsulated liposomes, the lipid film was hydrated in buffer 1 supplemented 
with 2 mg ml−1 dextrans. The liposomes were repeatedly washed to remove 
external dextran and then resuspended in 500 µl buffer 1. All the liposomes were 
stored at 4°C and used within 48 h. 

Liposome-binding and leakage assays. The indicated gasdermin proteins (5 µM) 
were incubated with the indicated liposome (500 µM lipids) at room temperature 
for 30 min in a total volume of 80 µl. Samples were centrifuged in a Beckman 
Optima MAX-XP ultracentrifuge at 4°C for 20 min at 100,000g. The supernatant 
(S) was collected to examine proteins not bound to the liposome. The pellets (P) 
were washed twice with 100 µl buffer 1 by re-centrifugation and brought up to the 
same volume as the supernatant. The S and P fractions were analysed by SDS-PAGE 
followed by Coomassie blue staining. 
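
As a worked example of the concentrations involved, the sketch below computes the nominal lipid stock concentration and the stock volume needed for one binding reaction. It assumes that all 0.5 µmol of lipid is recovered in the 500 µl hydration volume, which the text does not state, and the helper names are hypothetical.

```python
def concentration_um(amount_umol, volume_ul):
    """Molar concentration in µM from an amount in µmol and a volume in µl."""
    return amount_umol * 1e6 / volume_ul


def stock_volume_ul(final_conc_um, final_volume_ul, stock_conc_um):
    """Volume of stock needed so the final mix reaches final_conc_um (C1*V1 = C2*V2)."""
    return final_conc_um * final_volume_ul / stock_conc_um


# Assumption: no lipid is lost during hydration and extrusion.
lipid_stock_um = concentration_um(0.5, 500)              # nominal liposome stock
lipid_volume_ul = stock_volume_ul(500, 80, lipid_stock_um)  # for an 80 µl binding reaction
lipid_to_protein_ratio = 500 / 5                          # 500 µM lipid to 5 µM protein
```

Under that assumption the stock is 1 mM lipid, so half of each 80 µl reaction volume is liposome stock, and lipid is in 100-fold molar excess over protein.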

For the liposome leakage assay, aliquots of Tb3+-encapsulated liposomes were 
diluted to 300 µM lipid concentration in 90 µl of buffer 1 supplemented with 15 µM 
DPA. Excitation and emission wavelengths of 270 nm and 490 nm, respectively, 
were used to examine the Tb3+/DPA chelates (ref. 33). The emission fluorescence before 
adding the gasdermin protein was treated as $F_{t0}$. Protein (10 µl) was then added 
to a final concentration of 0.6 µM, and the emission fluorescence was continuously 
recorded as $F_t$ at 15-s intervals. After 20 min, 10 µl of 1% Triton X-100 was added 
to achieve complete release of Tb3+, and the mean of the top three fluorescence 
reads was defined as $F_{t100}$. The percentage of liposome leakage at each time point 
is defined as: leakage (t) (%) = $(F_t - F_{t0}) \times 100 / (F_{t100} - F_{t0})$. For the dextran leakage 
assay, aliquots of dextran-encapsulated liposomes (300 µM of lipids) were incubated 
with 0.6 µM indicated proteins at room temperature for 30 min in a total 
volume of 100 µl. After centrifugation, the released dextran in the supernatants 
was collected and the emission fluorescence (521 nm) after excitation at 494 nm 
was measured as $F_n$. The emission fluorescence of supernatants of untreated liposomes 
was measured as $F_0$, and that of the liposomes solubilized with 0.1% Triton 
X-100 was defined as $F_{100}$. The percentage of dextran leakage is defined as: leakage 
(%) = $(F_n - F_0) \times 100 / (F_{100} - F_0)$. 
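
Both leakage definitions above are the same normalization between a baseline and a detergent-released maximum. The following sketch shows one way to compute them; the function names and the example fluorescence values are hypothetical.

```python
def leakage_time_course(reads, baseline, triton_reads):
    """Leakage (%) at each time point of a Tb3+/DPA fluorescence trace.

    reads:        fluorescence values recorded after protein addition (F_t)
    baseline:     fluorescence before protein addition (F_t0)
    triton_reads: values recorded after Triton X-100; the mean of the
                  top three is used as the fully released signal (F_t100)
    """
    top_three = sorted(triton_reads, reverse=True)[:3]
    full_release = sum(top_three) / len(top_three)
    span = full_release - baseline
    if span <= 0:
        raise ValueError("Triton X-100 signal must exceed the baseline")
    return [(f - baseline) * 100.0 / span for f in reads]


def dextran_leakage_percent(f_sample, f_untreated, f_triton):
    """End-point dextran leakage: (F_n - F_0) * 100 / (F_100 - F_0)."""
    span = f_triton - f_untreated
    if span <= 0:
        raise ValueError("Triton X-100 signal must exceed the untreated signal")
    return (f_sample - f_untreated) * 100.0 / span


# Hypothetical trace: baseline 10, Triton X-100 reads averaging 110 for the top three.
trace = leakage_time_course([10.0, 60.0, 110.0], 10.0, [100.0, 110.0, 120.0, 90.0])
```

Averaging the top three post-detergent reads, as the text specifies, makes the maximum less sensitive to a single noisy reading.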

Crosslinking assays of gasdermin-N oligomerization. To assay gasdermin-N 
domain oligomerization in vitro, indicated PPase-cleaved engineered gasdermin 
proteins, before or after incubation with the liposome, were treated with 5 mM glutaraldehyde 
for 30 min at room temperature. The liposome pellets were solubilized 
in SDS loading buffer. Samples with or without crosslinking were analysed by SDS-agarose 
gel electrophoresis as previously described (ref. 19). To assay the oligomerization 
during pyroptosis, relevant cells, before or after pyroptosis induction, were harvested 
in PBS. The cell pellets were homogenized by passing through a 22G needle 
24 times. Cell lysates were centrifuged at 100,000g for 1 h to obtain the supernatant 
and the pellet fractions. The pellet was homogenized by gentle sonication. Both 
fractions were treated with 2 mM glutaraldehyde at room temperature for 15 min. 
Samples with or without crosslinking were analysed by both SDS-agarose and 
SDS-PAGE gel electrophoresis followed by immunoblotting. 

Electron microscopy and image processing. Gasdermin proteins (5 µM) were 
incubated with indicated liposomes (500 µM of lipids) at room temperature for 
30 min. Aliquots of the mixture (5 µl) were transferred to carbon support films on 
electron microscopy grids and negatively stained with 2% uranyl acetate. Samples 
were imaged on a Tecnai T12 microscope (FEI) at 120 kV. Images were taken on 
a Gatan 4k × 4k CCD camera with a nominal magnification of 30,000×, giving 
a final pixel size of 3.71 Å. To prepare pores on lipid monolayers, solutions of 
noncovalent complexes of cleaved GSDMD (500 nM) and GSDMA3 (100 nM) 
were pipetted into Teflon wells and coated with a droplet of 1 mM lipid mixture 
containing 80% phosphatidylcholine and 20% cardiolipin in chloroform. Negative-stain 
electron microscopy of the GSDMA3 pores was performed and images were 
captured as described above. Complete and undistorted pore particles were manually 
selected from the micrographs using EMAN2 (ref. 34). A total of 7,056 particle 
images were collected and normalized. After determining the defocus, the particles 
were phase-flipped for contrast transfer function correction using EMAN2 
(ref. 34). Two-dimensional reference-free classification was then performed in 
Relion1.3 (ref. 35). More than 80% of class averages were pores of a uniform size 
around 30 nm in diameter. The averaged view with the best particle contrast of a 
class comprising 242 particles was selected and its rotational auto-correlations were 
calculated in SPIDER (ref. 36) to determine the symmetry. 
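
To illustrate the idea behind a rotational auto-correlation, the toy sketch below estimates the symmetry order of a one-dimensional angular intensity profile, such as densities sampled around a ring. It is not SPIDER's implementation and does not reproduce the analysis in the paper; the function names, the synthetic profile and the peak-picking rule are assumptions made for illustration, and real class averages would need noise handling.

```python
import math


def rotational_autocorrelation(profile):
    """Circular autocorrelation of a mean-subtracted angular profile."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [v - mean for v in profile]
    return [
        sum(centred[i] * centred[(i + shift) % n] for i in range(n))
        for shift in range(n)
    ]


def estimate_symmetry_order(profile):
    """Return n / s, where s is the first non-zero shift at which the
    autocorrelation has a local maximum; returns 1 if none is found."""
    ac = rotational_autocorrelation(profile)
    n = len(ac)
    for shift in range(1, n // 2 + 1):
        if ac[shift] > ac[shift - 1] and ac[shift] >= ac[(shift + 1) % n]:
            return round(n / shift)
    return 1


def synthetic_ring(order, samples=360):
    """Hypothetical noise-free ring profile with `order` evenly spaced lobes."""
    return [1.0 + math.cos(2 * math.pi * order * i / samples) for i in range(samples)]


# A synthetic 8-lobed ring sampled every degree repeats every 45 degrees.
estimated = estimate_symmetry_order(synthetic_ring(8))
```

The first autocorrelation peak marks the smallest rotation that maps the profile onto itself, so dividing the full circle by that shift gives the symmetry order.
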

Structure determination. The crystallization experiments were performed using 
the sitting-drop vapour diffusion method at 20°C with 2-µl drops containing 
1 µl protein solution and 1 µl reservoir solution equilibrated over 100 µl reservoir 
solution. Initial crystallization hits of GSDMA3 were found from the Index 
Kit (Hampton Research) screen. Qualified crystals of SeMet-labelled GSDMA3 
were obtained in the reservoir buffer containing 100 mM Bis-Tris (pH 6.5), 19% 
polyethylene glycol 3350, and 10 mM TCEP within 2 weeks. For data collection, 
the crystals were soaked in a cryoprotectant solution containing the reservoir 
buffer supplemented with 20% ethylene glycol before flash-freezing with liquid 
nitrogen. Diffraction data were collected at the Shanghai Synchrotron Radiation 
Facility (Shanghai, China) beamline BL19U1 under a wavelength of 0.97776 Å, 
and processed with the HKL 2000 suite (ref. 37). Phase determination by the single-wavelength 
anomalous dispersion (SAD) method and automatic model building were 
performed in PHENIX (ref. 38). The rest of the model was manually built with Coot (ref. 39). 
The structure of GSDMA3 was refined in PHENIX, and manual modelling was 
performed between refinement cycles. The statistics of data collection and refinement 
are summarized in Extended Data Table 1. The quality of the final model was 
validated by MolProbity (ref. 40). Ramachandran statistics indicated that all the residues 
are in the allowed region, in which 97.25% fell into the favoured region. Each 
asymmetric unit contains one GSDMA3 molecule and the model covers residues 
1-453 of the 464 total residues. As well as the C-terminal tail being absent, four 
loops (residues 66-80, 170-178, 188-193 and 249-263) could not be modelled 
owing to the lack of interpretable electron density for these highly flexible loops. 
Homology-based modelling of GSDMD structure was performed with the program 
Modeller (ref. 41) based on the sequence alignment of GSDMA3 and GSDMD. The 
structural model of GSDMA3 was completed by using Loop Refine in Modeller 
and used as the modelling template. The top hit of GSDMD models was evaluated 
using the DOPE statistical potential score (ref. 42) and chosen for subsequent analysis 
and comparison with GSDMA3 structure. All structural figures were generated in 
PyMOL (http://www.pymol.org). 


31. Zhao, Y. et al. The NLRC4 inflammasome receptors for bacterial flagellin and 
type III secretion apparatus. Nature 477, 596-600 (2011). 

32. Gong, Y.-N. et al. Chemical probing reveals insights into the signaling 
mechanism of inflammasome activation. Cell Res. 20, 1289-1305 (2010). 

33. Wilschut, J. & Papahadjopoulos, D. Ca2+-induced fusion of phospholipid 
vesicles monitored by mixing of aqueous contents. Nature 281, 690-692 
(1979). 

34. Tang, G. et al. EMAN2: an extensible image processing suite for electron 
microscopy. J. Struct. Biol. 157, 38-46 (2007). 

35. Scheres, S. H. A Bayesian view on cryo-EM structure determination. J. Mol. Biol. 
415, 406-418 (2012). 

36. Frank, J. et al. SPIDER and WEB: processing and visualization of images in 3D 
electron microscopy and related fields. J. Struct. Biol. 116, 190-199 (1996). 

37. Otwinowski, Z. & Minor, W. in Methods in Enzymology Vol. 276 (eds Carter, C. W. Jr 
& Sweet, R. M.) 307-326 (Academic, 1997). 

38. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for 
macromolecular structure solution. Acta Crystallogr. D 66, 213-221 (2010). 

39. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of 
Coot. Acta Crystallogr. D 66, 486-501 (2010). 

40. Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular 
crystallography. Acta Crystallogr. D 66, 12-21 (2010). 

41. Eswar, N. et al. Comparative protein structure modeling using MODELLER. 
Curr. Protoc. Protein Sci. Chapter 2, Unit 2.9 (2007). 

42. Shen, M. Y. & Sali, A. Statistical potential for assessment and prediction of 
protein structures. Protein Sci. 15, 2507-2524 (2006). 


© 2016 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 1, panels a-d. Constructs shown: GSDMD (FL, N, C), GSDMA, GSDMA3 (1-261), GSDMC and DFNA5, with and without IPTG. Axes: Cell viability (%); CFUs (log10)/transformation.]

Extended Data Figure 1 | Multiple gasdermin-N domains can induce
mammalian cell pyroptosis and also exhibit cytotoxicity in bacteria.
a, b, Full-length (FL) or N-terminal domain regions of different
gasdermin-family members were transfected into 293T cells for 20 h.
Human GSDMD and mouse GSDMA3 had an N-terminal 3×Flag tag and
human GSDMA, GSDMB, GSDMC and DFNA5 had a C-terminal Flag tag.
ATP-based cell viability is expressed as mean ± s.d. from three technical
replicates (a). Representative views of cell death morphology are shown
in b. c, d, Cytotoxicity of the gasdermin-N domain in bacteria. Indicated
gasdermins were cloned into an IPTG-inducible vector for transformation
into E. coli. c, Representative agar plates showing transformed E. coli
colonies for GSDMD. d, Bacterial colony-forming units (CFU) per
transformation for GSDMD and other gasdermins are shown in the
logarithmic form (log10) as mean ± s.d. from three technical replicates.
All data shown are representative of three independent experiments.
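
The summary statistic in panel d (log10 CFU, mean ± s.d. over three technical replicates) can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the CFU counts are hypothetical, the function name is invented here, and the use of the sample standard deviation and of log-transforming before averaging are assumptions.

```python
import math
import statistics


def log10_cfu_summary(cfu_counts):
    """Return (mean, sample s.d.) of the log10-transformed CFU counts."""
    logs = [math.log10(count) for count in cfu_counts]
    return statistics.mean(logs), statistics.stdev(logs)


# Hypothetical CFU counts per transformation (three technical replicates each);
# these are illustrative numbers, not data from the paper.
replicates = {
    "-IPTG": [1.2e6, 0.9e6, 1.1e6],
    "+IPTG": [3.0e2, 4.5e2, 2.5e2],
}

for condition, counts in replicates.items():
    mean, sd = log10_cfu_summary(counts)
    print(f"{condition}: log10 CFU = {mean:.2f} ± {sd:.2f}")
```

Taking logarithms first means the error bar describes the spread in orders of magnitude, which suits colony counts that differ by several logs between conditions.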


[Extended Data Figure 2, panels a-f: liposome co-sedimentation gels (S, supernatant; P, pellet; markers in kDa) and lipid-strip blots probed with anti-Flag. Panel titles: a, Liposomes: 80% PC + 20% phosphoinositides; b, Liposomes: 90% PC + 10% phosphoinositides; c, Liposomes containing 45% PC + 35% PE + 5% PS + 5% PI + 10% PE or PI(4,5)P2; d, Liposomes: 90% PC + 10% membrane lipids; f, Liposomes made of polar lipid extracts (liver, brain).]


Extended Data Figure 2 | Membrane phospholipid binding of
the gasdermin-N domain. a-d, f, Liposomes with indicated lipid
compositions (a-d) or prepared using bovine liver or brain-derived
polar lipid extracts (f) were incubated with purified gasdermin proteins.
After ultracentrifugation, the liposome-free supernatant (S) and liposome
pellet (P) were analysed by SDS-PAGE and Coomassie blue staining.
e, Noncovalent complex of cleaved GSDMD and GSDMA3 with a Flag
tag attached to the end of the gasdermin-N domain or the corresponding
uncleaved full-length proteins were incubated with the lipid strips, and
the strips were then probed with the anti-Flag antibody. Right, protein
loading control. All data shown are representative of three independent
experiments.


[Extended Data Figure 3, panels a-g. Panel titles: a, iBMDM Gsdmd−/− + 2×Flag-HA-GSDMD, −LPS and +LPS (blots for GSDMD-FL, GSDMD-N, GSDMD-C, actin, Cox4, Tom20); c, bright field and GSDMA3-N L184D-EGFP; d, 293T cells + extracellular addition of protein; f, time course with PFO, control, GSDMD-(N+C), GSDMD-FL, GSDMD-C and Triton X-100, x axis Time (min).]


Extended Data Figure 3 | Biomembrane association and lysis by the
gasdermin-N domain. a, b, Subcellular fractionation of the gasdermin-N
domains of GSDMD and GSDMA during pyroptosis. Gsdmd−/− iBMDMs
expressing 2×Flag and haemagglutinin (HA)-tagged GSDMD were
untreated or stimulated with LPS electroporation (a). 293T cells expressing
PPase-cleavable Flag-GSDMA were untreated or electroporated
with purified PPase (b). Homogenized cell extracts were sequentially
centrifuged at 700g, 20,000g and 100,000g to separate membrane fractions
(P7, P20 and P100) from the S100 soluble fraction. The fractions were
immunoblotted as indicated. c, Microscopy of GSDMA3-N domain
localization in cells undergoing pyroptosis. The gasdermin-N domain of
GSDMA3 (GSDMA3-N(L184D)) fused N-terminally to eGFP was stably
expressed in HeLa cells under a tetracycline-inducible promoter. Shown
are representative time-lapse cell images (brightfield and fluorescence)
taken from 4-5 h after doxycycline addition. Scale bar, 15 μm. For videos
of two representative cells, see Supplementary Videos 3 and 4. d, e, Effects
of extracellular or intracellular delivery of purified gasdermin proteins on
293T cell viability. Equal amounts of indicated gasdermin proteins or PFO
were added directly into cell culture medium (d) or electroporated into the
cytosol (e). ATP-based cell viability is expressed as mean ± s.d. from three
technical replicates. f, g, Bacterial protoplast lysis by purified gasdermin
proteins. Protoplasts of B. megaterium were treated with indicated
gasdermin proteins or PFO. Membrane lysis was assessed by measuring
the OD600 of the protoplasts. Triton X-100 treatment was used to achieve
100% lysis of the protoplasts. Time-course measurement of GSDMD
treatment is shown in f. Relative protoplast lysis by GSDMD and other
gasdermins is expressed as mean ± s.d. from three technical replicates (g).
All data shown are representative of three independent experiments.
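
One plausible way to turn OD600 readings into the relative lysis plotted in panel g is to scale each reading between an untreated control (0% lysis) and the Triton X-100 sample (100% lysis). The sketch below assumes that linear normalization. The caption does not spell out the formula, and the function name and OD600 values are hypothetical.

```python
def percent_lysis(od_sample, od_untreated, od_triton):
    """Scale an OD600 reading so that untreated = 0% and Triton X-100 = 100% lysis."""
    span = od_untreated - od_triton
    if span <= 0:
        raise ValueError("untreated OD600 must exceed the Triton X-100 OD600")
    return 100.0 * (od_untreated - od_sample) / span


# Hypothetical OD600 readings (not data from the paper).
od_untreated, od_triton = 0.80, 0.10
for name, od in (("GSDMD-(N+C)", 0.24), ("GSDMD-FL", 0.78)):
    print(f"{name}: {percent_lysis(od, od_untreated, od_triton):.0f}% lysis")
```

Normalizing to the two controls makes readings from different protoplast preparations comparable, because each preparation carries its own 0% and 100% reference.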


[Extended Data Figure 4, panels a-c: time courses, x axis Time (min), 0-25. Panel titles: a, Liposomes: 80% PC + 20% PI(4,5)P2; b, Liposomes: 45% PC + 35% PE + 5% PS + 5% PI + 10% PI(4,5)P2, with 1×, 3× and 5× protein; c, Liposomes: 45% PC + 35% PE + 5% PS + 5% PI + 10% PI(4,5)P2 or PE.]
Extended Data Figure 4 | Liposome-leakage-inducing activity of the
gasdermin-N domain. a-c, Liposomes with indicated lipid compositions
were treated with purified gasdermin proteins or PFO as indicated.
Liposome leakage was monitored by measuring DPA chelating-induced
fluorescence of released Tb³⁺. Time course of relative Tb³⁺ release is
shown. A dose titration of GSDMA proteins is shown in b. CTL, control.
All data shown are representative of three independent experiments.


[Extended Data Figure 5, panels a-d. a, gel filtration traces, A280 nm (mAU) against elution volume (ml), for GSDMD-FL, GSDMA3-FL and GSDMA-FL, with 75 kDa, 43 kDa and 29 kDa markers; b, SDS-agarose gel with and without liposomes and crosslinker.]


Extended Data Figure 5 | Membrane binding-induced oligomerization
of and pore formation by the gasdermin-N domain. a, Gel filtration
chromatography of full-length GSDMD, GSDMA and GSDMA3.
Indicated gasdermin proteins were loaded on the Superdex G75 column.
Arrows indicate elution volume of the molecular mass markers.
b, Oligomerization of gasdermin-N domain on the liposome membrane.
Indicated gasdermin proteins or PFO were incubated with cardiolipin or
cholesterol liposomes, respectively. Intact proteins or proteins associated
with the liposomes were mock treated or treated with glutaraldehyde
and analysed by SDS-agarose gel electrophoresis and Coomassie blue
staining. The gasdermin-C domain migrating at the bottom of the gel
was omitted for clarity. c, Oligomerization of the gasdermin-N domain
during pyroptosis. To trigger pyroptosis, Gsdmd−/− iBMDMs expressing
2×Flag-HA-GSDMD and HeLa cells expressing the PPase-cleavable
Flag-GSDMA were electroporated with LPS and PPase, respectively.
The cytosol (S) and membrane (P) fractions from unstimulated and
pyroptotic cells were subjected to glutaraldehyde-mediated crosslinking
followed by SDS-agarose (top) or SDS-PAGE (bottom) gel electrophoresis.
d, Pore-forming activity of the gasdermin-N domain. Liposomes with 80%
phosphatidylcholine and 20% PtdIns(4,5)P2 were treated with indicated
gasdermin proteins. Shown are representative negative-stain electron
microscopy micrographs of the liposomes (scale bar, 100 nm). All data
shown are representative of three independent experiments.


[Extended Data Figure 6, panels a-e. a, b, histograms of pore inner diameter (nm) for 5 μM and 0.5 μM GSDMD-(N+C) and GSDMA3-(N+C); c, anthrax lethal toxin (NLRP1B inflammasome) and BsaK (NAIP2/NLRC4 inflammasome), with PEG200, PEG1500 and PEG2000 at 1.2% and 12%; d, Cardiolipin monolayer membrane; e, x axis Rotational angle, −180° to 180°.]


Extended Data Figure 6 | Analyses of the gasdermin pore. a, b, Size
distribution of GSDMD and GSDMA3 pores formed on cardiolipin
liposomes. The inner diameters of pores were measured and plotted. A
total of 200 or 100 pores for 5 μM or 0.5 μM GSDMD/GSDMA3-treated
liposome samples, respectively, were randomly selected from the
negative-stain electron microscopy micrographs in Fig. 3a. c, Effects of different
PEG molecules on lactate dehydrogenase (LDH) release from
caspase-1-mediated pyroptotic cells. iBMDMs harbouring a sensitive Nlrp1b
allele were treated with indicated mass concentration of different PEG
molecules and then stimulated with anthrax lethal toxin or LFn-BsaK
to activate the canonical NLRP1B or NAIP2/NLRC4 inflammasomes,
respectively. 1.2% PEG200 and 12% PEG2000 (mass concentration)
have roughly the same molar concentration. Shown are LDH release
expressed as mean ± s.d. from three technical replicates. d, Pores formed
by active GSDMD and GSDMA3 on monolayer membranes containing
80% phosphatidylcholine and 20% cardiolipin. Shown are representative
negative-stain electron microscopy micrograph images (scale bar, 100 nm).
e, Symmetry determination of the gasdermin pore. GSDMA3 pores
formed on the monolayer membrane (d) were subjected to 2D
reference-free classification. One class of pores with the best particle contrast was
subjected to rotational auto-correlation calculation and the inset electron
microscopy image (scale bar, 20 nm) shows the averaged view of the class
of pores (242 particles). The analyses revealed 16-fold symmetry. Data
shown in a-d are representative of three independent experiments.
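
The idea behind the rotational auto-correlation in panel e can be illustrated with a one-dimensional toy model: correlate an angular intensity profile with rotated copies of itself and count how many times per full turn the correlation peaks. The sketch below uses a synthetic 16-lobed profile. It is not the authors' image-processing pipeline, which operated on 2D class averages, and all names and the sampling step are choices made for this example.

```python
import math


def rotational_autocorrelation(profile):
    """Circular autocorrelation of a mean-subtracted angular profile, one value per shift."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [p - mean for p in profile]
    return [
        sum(centred[i] * centred[(i + k) % n] for i in range(n))
        for k in range(n)
    ]


def count_peaks(values):
    """Count strict local maxima, treating the sequence as circular."""
    n = len(values)
    return sum(
        1
        for k in range(n)
        if values[k] > values[k - 1] and values[k] > values[(k + 1) % n]
    )


# Synthetic ring with 16 evenly spaced lobes, sampled every 0.5 degrees.
n_samples = 720
profile = [1.0 + math.cos(16 * 2 * math.pi * i / n_samples) for i in range(n_samples)]

acf = rotational_autocorrelation(profile)
print("estimated symmetry:", count_peaks(acf), "fold")
```

Real micrographs are noisy, so in practice the peak count would be read from an averaged class rather than from a single particle, as the caption describes.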




d
No.  PDB chain  Z score      RMSD (Å)     Aligned residues  Identity (%)  Molecule name
1    3hvn-A     5.0          5.4          130               11            Suilysin
2    4cdb-A     4.9          5.4          127               11            Listeriolysin O
3    4qqa-A     4.9          5.9          130               [illegible]   Pneumolysin
4    3cqf-A     4.7          5.9          127               10            Anthrolysin O
5    1m3j-A     4.4          5.3          129               12            Perfringolysin O
6    1s3r-A     4.3          5.6          129               11            Intermedilysin
7    3kk7-A     3.4          5.2          116               9             MACPF from B. thetaiotaomicron
8    2qp2-A     [illegible]  5.4          [illegible]       11            MACPF from P. luminescens
9    4wvm-B     3.2          3.4          84                4             Stonustoxin subunit beta
10   4wvm-A     [illegible]  [illegible]  85                [illegible]   Stonustoxin subunit alpha
11   2qqh-A     [illegible]  [illegible]  [illegible]       5             Complement component C8 alpha chain
12   3nsj-A     3.0          4.9          111               8             Perforin-1
13   3ojy-B     2.9          5.2          118               8             Complement component C8 beta chain
14   4ov8-A     2.8          6.6          110               [illegible]   Pleurotolysin B
15   4e0s-B     [illegible]  4.9          114               11            Complement component C6
Extended Data Figure 7 | Crystal structure of GSDMA3 and Dali search
results for its gasdermin-N domain. a, 2Fo − Fc electron density map
(contoured at 1.0σ) of GSDMA3 gasdermin-N domain (GSDMA3-N)
structure. b, Cartoon diagram of GSDMA3-N structure. c, Structural
model of GSDMD obtained from homology modelling and the conserved
autoinhibitory interactions. Bottom, overall structure of modelled
GSDMD; top, comparisons of the hydrophobic core (left) and the second
inter-domain contact (right) with the corresponding structures in
GSDMA3. Conserved residues involved in the autoinhibitory interactions
are labelled and shown as sticks. Cyan, GSDMD-N; orange, GSDMD-C;
green, GSDMA3-N; yellow, GSDMA3-C. d, Dali search results for the
GSDMA3-N structure.


[Extended Data Figure 8: sequence alignment panel for GSDMA3, GSDMD, GSDMA, GSDMB, GSDMC and DFNA5, annotated with the secondary-structure elements (α-helices and β-strands) of GSDMA3 and with residue numbers.]

Extended Data Figure 8 | Multiple sequence alignment of gasdermin
family members. GSDMA3 is a mouse protein and sequences of other
gasdermins are from human. The secondary structures determined from
GSDMA3 are marked along the sequence. The alignment was performed
with the ClustalW2 algorithm with structure-based manual adjustment
of the α4 region in GSDMA3 and GSDMD. Identical residues are
highlighted by dark red background and conserved residues are coloured
red. The residues involved in the autoinhibitory interactions are marked
underneath the sequences with blue triangle for polar residues, orange
rhombus for hydrophobic residues and black rhombus for hydrophobic
residues in the second inter-domain interface. The residue number is
indicated on the left of the sequence.


[Extended Data Figure 9, panels a-d. a, 293T cell transfection of 3×Flag-GSDMA3 (FL, WT, L184D, E14K/L184D), blots for anti-Flag and tubulin; b, co-sedimentation gels (S, P; kDa) for WT-(N+C), L184D-(N+C) and E14K/L184D-(N+C) with PE, CL, PI and PI(4,5)P2; c, time courses on cardiolipin and PI(4,5)P2 liposomes, x axis Time (min); d, GSDMA3-(N+C), GSDMA3 L184D-(N+C) and GSDMA3 E14K/L184D-(N+C).]


Extended Data Figure 9 | Mutations in GSDMA3-N affecting lipid
binding and pore formation also reduce pyroptosis. a, Effects of L184D/
E14K mutations on pyroptosis-inducing activity of GSDMA3-N (residues
1-284). Full-length GSDMA3 or its gasdermin-N domain (wild type
or indicated mutants) was transfected into 293T cells. ATP-based cell
viability is expressed as mean ± s.d. from three technical replicates.
b, c, Effects of L184D/E14K mutations on lipid-binding and liposome-
leakage-inducing activities of GSDMA3-N domain. Liposomes containing
80% phosphatidylcholine and 20% phosphatidylethanolamine, cardiolipin,
phosphatidylinositol or PtdIns(4,5)P2 were treated with purified
GSDMA3. After ultracentrifugation, the liposome-free supernatant (S)
and the liposome pellet (P) were analysed by SDS-PAGE (b). Liposome
leakage was monitored by measuring DPA chelating-induced fluorescence
of released Tb³⁺ (c). Triton X-100 treatment was used to achieve 100%
leakage. d, Effects of L184D/E14K mutations on pore formation by
GSDMA3-N. Representative electron microscopy images of the pores on
the cardiolipin liposome are shown (scale bar, 100 nm). All data shown are
representative of three independent experiments.




Extended Data Table 1 | Data collection and refinement statistics

                            Se_GSDMA3(1-464) (5B5R)
Data collection
Space group                 P2₁
Cell dimensions
    a, b, c (Å)             43.453, 103.414, 49.625
    α, β, γ (°)             90.00, 110.57, 90.00
Wavelength (Å)              0.97776
Resolution (Å)              50.00-1.90 (1.93-1.90)*
Rmerge                      0.071 (0.876)
I/σ(I)                      25.6 (2.4)
Completeness (%)            97.1 (92.7)
Redundancy                  6.6 (5.1)
Refinement
Resolution (Å)              37.91-1.90
No. reflections             44,110
Rwork / Rfree               0.1892/0.2319
No. atoms
    Protein                 3,264
    Water                   218
B factors
    Protein                 28.36
    Water                   33.91
r.m.s. deviations
    Bond lengths (Å)        0.005
    Bond angles (°)         0.848

One SeMet crystal was used for data collection and structure determination.
*Values in parentheses are for highest-resolution shell.
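
As a quick consistency check on the cell dimensions above, the volume of a monoclinic unit cell (α = γ = 90°) follows from V = a·b·c·sin β. The short sketch below applies that textbook formula to the tabulated values. The function name is made up for this example and the calculation is not part of the authors' refinement workflow.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell (alpha = gamma = 90 degrees): V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Cell dimensions (Å) and beta angle (degrees) as listed in the table.
volume = monoclinic_cell_volume(43.453, 103.414, 49.625, 110.57)
print(f"unit-cell volume ≈ {volume:,.0f} Å^3")
```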






Martian moons 
formed in situ 


The moons of Mars may have 
formed from a disk of debris 
kicked up by the impact of a 
giant meteorite on the planet. 
Astronomers have struggled 
to explain the existence 
of Phobos (pictured) and 
Deimos, the small, irregularly 
shaped moons of the red 
planet. One view is that they 
were asteroids captured 
by Mars. But a team led by 
Pascal Rosenblatt at the Royal 
Observatory of Belgium in 
Brussels tested an alternative 
idea using computer 
simulations of how orbiting 
debris, created by a giant 
impact, might coalesce. 
Relatively large moons form 
in the inner part of the disk 
thrown up by such a smash, 
and migrate outward, causing 
the outer part of the disk to 
coalesce into two bodies the 
sizes of Phobos and Deimos. 
The inner large moons are 
eventually dragged inward and 
fall back to Mars over 5 million 
years. 
Nature Geosci. http://dx.doi.org/10.1038/ngeo2742 (2016) 


Warming shifts 
plant sex ratio 


Climate change seems to be 
skewing the sex ratios of an 
alpine herb towards male 
plants. 

William Petry at 
the University of 
California, Irvine, 
and his colleagues 
analysed data on 
populations of the herb 
valerian (Valeriana edulis) 
in the Rocky Mountains of 
Colorado as the region became 
warmer and drier over the 
past few decades. They found 
that in 2011, plants at the 
highest elevations were only 
23% male, whereas at lower 
altitudes, where the climate is 
warmer and wetter, the plants 
were up to 50% male. Across 
9 populations at a variety 
of elevations, there was an 
average of 6% more males in 
2011 than in 1978. 

A higher male-to-female 
ratio could result in increased 
pollination — and therefore 
seed production — which 
could help the plants to 
expand their range as they 
adapt to climate change, the 
authors suggest. 

Science 353, 69-71 (2016) 


Soft wheels make 
robots tough 


Wheels built entirely from 
soft materials can help robots 
to roll over tricky terrain and 
resist damage. 

Aaron Mazzeo and his 
co-workers at Rutgers 
University in Piscataway, 
New Jersey, built a squishy 
wheel inspired by the inching 
motions of soft creatures such 
as earthworms. A stretchable 
ring contains multiple 
internal chambers, groups 
of which can be inflated and 
deflated sequentially around 
the circle. The pressurized 
compartments exert torque on 
a second, outer ring, causing it 
to turn. 

A soft robotic vehicle fitted 
with four of these wheels 
(pictured) travelled on a flat 
surface at 3.7 centimetres 
per second and kept moving 
after being dropped from 
eight times its height. The 
researchers also drove the 
robot over a rocky landscape 
and underwater, and show 
that their concept can be 
modified to make winch 
rotors. 

Adv. Mater. http://doi.org/f3qjsh 
(2016) 


SOCIAL SELECTION 


Fake article webpages draw fire 


A debate is swirling around a tactic that academic publisher 
John Wiley & Sons uses to fight online piracy (see 
go.nature.com/299xily). The company created a webpage, accessible 
by several URLs, that looked like an academic paper to 
automated downloading programs. But any users who 
accessed the URLs were then blocked from viewing other Wiley 
content. Wiley and other publishers use these ‘trap’ URLs, 
which are invisible to most human users, 
to detect and prevent unauthorized 
downloading and republishing of their 
content. But some users say that the 
tactic is too heavy-handed. 

NATURE.COM 
For more on popular papers: 
go.nature.com/29hhqog 


Leukaemia cells 
hide in fat tissue 


Cancer-causing stem cells 
evade chemotherapy by 
surviving in fat deposits 
around gonads. 

Fat tissue supports the 
growth of normal blood-forming 
stem cells. Craig 
Jordan of the University of 
Colorado Denver and his 
colleagues found that in a 
mouse model of one form 
of leukaemia, gonadal fat 
deposits contained numerous 
cancer stem cells, but 
subcutaneous fat had very 
few. Leukaemic cells induced 
the breakdown of gonadal fat, 
releasing nutrients that fuelled 
the growth of malignant 
cells in fat as well as other 
tissues. The cancer stem 
cells also expressed CD36, a 
cellular marker that boosts 
fat metabolism, helping to 
protect the cells from many 
chemotherapy drugs. 
Targeting fat metabolism 
could help to eradicate 
leukaemia stem cells, 
the authors suggest. 

Cell Stem Cell http://doi.org/bkqj (2016) 


Negative carbon 
emissions needed 


Countries’ existing promises 
regarding emissions 
reductions are unlikely to 
prevent global warming 
exceeding 2 °C above pre- 
industrial temperatures by the 
end of the century, meaning 
that large amounts of carbon 
may need to be removed from 
the atmosphere. 

Benjamin Sanderson 
and his co-workers at the 
US National Center for 
Atmospheric Research in 
Boulder, Colorado, explored 
the odds of staying below 
2 °C of warming for a range of 
emissions pathways. They also 
analysed whether ‘negative 
emissions’ — the removal of 
carbon from the atmosphere 
— will be necessary. 

To avoid crossing the 
2-degree threshold during this 
century, net global emissions 
must drop to zero by 2085, 
the authors find. Depending 
on the level of near-term 
reductions, between 1.5 billion 
and 5 billion tonnes of CO2 per 
year will need to be captured 
and removed from the 
atmosphere thereafter. 
Geophys. Res. Lett. http://doi.org/bkqh (2016) 


> NATURE.COM 

For the latest research published by 
Nature visit: 
www.nature.com/latestresearch 


7 JULY 2016 | VOL 535 | NATURE | 11 




LETTER 


doi:10.1038/nature18627 


The quiescent intracluster medium in the core of the Perseus cluster 


The Hitomi collaboration* 


Clusters of galaxies are the most massive gravitationally bound 
objects in the Universe and are still forming. They are thus important 
probes¹ of cosmological parameters and many astrophysical 
processes. However, knowledge of the dynamics of the pervasive 
hot gas, the mass of which is much larger than the combined mass of 
all the stars in the cluster, is lacking. Such knowledge would enable 
insights into the injection of mechanical energy by the central 
supermassive black hole and the use of hydrostatic equilibrium for 
determining cluster masses. X-rays from the core of the Perseus 
cluster are emitted by the 50-million-kelvin diffuse hot plasma 
filling its gravitational potential well. The active galactic nucleus 
of the central galaxy NGC 1275 is pumping jetted energy into the 
surrounding intracluster medium, creating buoyant bubbles filled 
with relativistic plasma. These bubbles probably induce motions in 
the intracluster medium and heat the inner gas, preventing runaway 
radiative cooling—a process known as active galactic nucleus 
feedback-®. Here we report X-ray observations of the core of the 
Perseus cluster, which reveal a remarkably quiescent atmosphere 
in which the gas has a line-of-sight velocity dispersion of 164 ± 10 
kilometres per second in the region 30-60 kiloparsecs from the 
central nucleus. A gradient in the line-of-sight velocity of 150 ± 70 
kilometres per second is found across the 60-kiloparsec image of 
the cluster core. Turbulent pressure support in the gas is four per 
cent of the thermodynamic pressure, with large-scale shear at most 
doubling this estimate. We infer that a total cluster mass determined 
from hydrostatic equilibrium in a central region would require little 
correction for turbulent pressure. 
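
The quoted figure of about four per cent can be reproduced to order of magnitude with a back-of-envelope estimate. The sketch below assumes isotropic turbulence, so that the turbulent pressure is ρσ² with σ the line-of-sight dispersion, an ideal-gas thermal pressure ρkT/(μ m_p), a mean molecular weight μ = 0.6, and a temperature of 50 million kelvin taken from the plasma temperature quoted above. None of these choices is stated to be the collaboration's actual method, and the function name is illustrative.

```python
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV per kelvin
EV_TO_J = 1.602177e-19       # joules per electronvolt
M_P = 1.672622e-27           # proton mass in kg


def turbulent_to_thermal_pressure(sigma_los_km_s, temperature_k, mu=0.6):
    """Ratio P_turb / P_therm = mu * m_p * sigma^2 / (k T), assuming isotropic turbulence."""
    sigma_m_s = sigma_los_km_s * 1.0e3
    kt_joule = K_B_EV_PER_K * temperature_k * EV_TO_J
    return mu * M_P * sigma_m_s**2 / kt_joule


# 164 km/s dispersion and 50 million K plasma, both taken from the text above.
ratio = turbulent_to_thermal_pressure(164.0, 5.0e7)
print(f"P_turb / P_therm ≈ {ratio:.3f}")
```

Because the ratio scales as σ², even a dispersion twice as large would leave turbulent support well below the thermal pressure under these assumptions.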

The JAXA Hitomi X-ray Observatory⁷ was launched on 2016
February 17 from Tanegashima, Japan. It carries the non-dispersive
soft X-ray spectrometer (SXS)⁸, which is a calorimeter cooled to
0.05 K giving a Gaussian-shaped energy response with a 4.9-eV full-
width at half-maximum (FWHM; ratio of photon energy to FWHM,
E/dE = 1,250 at 6 keV) over a 6 × 6 pixel array (total 3 arcmin × 3 arcmin).
It operates over an energy range of 0.3–12 keV with X-rays focused by
a mirror⁹ with angular resolution of 1.2 arcmin (half-power diameter).
A gate valve was in place for early observations to minimize the risk of 
contamination from outgassing of the spacecraft. It includes a beryl- 
lium window that absorbs most X-rays below about 3 keV. The SXS 
can detect bulk and turbulent motions of the intracluster medium by 
measuring Doppler shifts and broadening of the emission lines with 
unprecedented accuracy. It also allows the detection of weak emission 
lines or absorption features. 

The SXS imaged a 60 kpc × 60 kpc region in the Perseus cluster
centred 1 arcmin to the northwest of the nucleus for a total exposure
time of 230 ks. The offset from the nucleus was due to the attitude
control system not having then been calibrated. For this early
observation, not all calibration procedures were available; in particular,
we did not have contemporaneous calibration of the energy scale
factors (gains) of the detector pixels. Gain variation over short time
intervals was corrected using a separate calibration pixel illuminated
by 5.9-keV Mn Kα photons from an ⁵⁵Fe X-ray source. Gain values
were pinned to an absolute scale via extrapolation of a subsequent
calibration of the whole array 10 days later using illumination by
another ⁵⁵Fe source mounted on the filter wheel. (For more details,
see Methods.) We used a subset of the Perseus data closest to that
calibration to derive the velocity map. For the line-width
determination, we used the full dataset to minimize the statistical
uncertainty, and applied a scale factor to force the Fe XXV Heα complex
from the cluster to have the same energy in all pixels. This minimizes
the gain uncertainty in the determination of the velocity dispersion,
but also removes any true variations of the line-of-sight velocity of
the intracluster medium across the field.

A 5.5–8.5-keV spectrum of the full 3 arcmin × 3 arcmin field is
shown in Fig. 1. This spectrum shows a thermal continuum with line
emission¹⁰ from Cr, Mn, Fe and Ni. The strongest lines are from Fe and
consist of Fe XXV Heα, Heβ and Heγ complexes, together with H-like
Fe XXVI Lyman α (Lyα) lines. The total number of counts in the Fe XXV
Heα line is 21,726, of which about 16 counts are expected from residual
instrumental background. The line complex is spread over about 75 eV
and its major components include the resonance, intercombination and
forbidden lines, all of which have been resolved.

We adopt a minimally model-dependent method for spectral fitting,
and represent the Fe XXV Heα, Fe XXV Heβ and Fe XXVI Lyα
complexes in the spectrum with a set of Gaussians with free
normalizations and energies fixed either at redshifted laboratory values
(for He-like Fe) or theoretical values (for H-like Fe); see Extended
Data Table 1. Figure 2 shows the profiles of these lines in a spectrum
obtained from the outer region of the Perseus core, which excludes the
active galactic nucleus (AGN) and prominent inner bubbles (Fig. 3).
To measure the line-of-sight velocity broadening (Gaussian σ), we
fitted the high-signal-to-noise Fe XXV Heα line complex using nine
Gaussians associated with lines known from atomic physics and
obtain 164 ± 10 km s⁻¹ (all uncertainties are quoted at the 90%
confidence level). The widths of the 6.7008-keV resonance line and the
6.617-keV blend of faint satellite lines are allowed to be separate from
the rest of the lines. The effect of the thermal broadening expected
from the observed 4-keV plasma has been removed (alone it
corresponds to 80 km s⁻¹). Conservative estimates of the uncertainty in
energy resolution (FWHM of 5 ± 0.5 eV) result in a systematic
uncertainty range in the turbulent velocity of only ±6 km s⁻¹, because the
total measured line width is roughly twice the instrumental
broadening, which adds to the astronomical broadening in quadrature.
Uncertainties in plasma temperature add only a further ±2 km s⁻¹.
The statistical scatter in the energy-scale alignment of co-added pixels
results in an overestimate of the true broadening by not more than
3 km s⁻¹. The finite angular resolution of the telescope in the presence
of a velocity gradient across the cluster results in a small artificial
increase in the measured dispersion (see Methods) that is difficult
to quantify at this stage.
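
The quadrature bookkeeping in the preceding paragraph can be illustrated with a short numerical sketch. This is not the collaboration's fitting code: the function names, the use of the standard atomic weight of iron, and the choice to express the instrumental FWHM as a velocity at the redshifted 6.7008-keV resonance line are assumptions made here for illustration. Under those assumptions the thermal width of Fe at kT = 4 keV comes out at roughly 80 km s⁻¹, the total line width is about twice the instrumental width, and a ±0.5-eV change in the assumed FWHM moves the inferred turbulent width by only a few km s⁻¹.

```python
import math

C_KM_S = 299_792.458        # speed of light, km/s
AMU_EV = 931.494e6          # atomic mass unit, eV/c^2
FE_MASS_AMU = 55.845        # standard atomic weight of iron (assumption)
Z_CLUSTER = 0.01756         # cluster redshift quoted in the text
E_REST_EV = 6700.8          # Fe XXV Heα resonance line, rest frame

# Conversion from FWHM to Gaussian sigma: FWHM = 2 sqrt(2 ln 2) sigma
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def thermal_sigma_km_s(kt_ev, mass_amu=FE_MASS_AMU):
    """One-dimensional thermal velocity dispersion sqrt(kT/m) of an ion."""
    return C_KM_S * math.sqrt(kt_ev / (mass_amu * AMU_EV))


def instrumental_sigma_km_s(fwhm_ev, line_energy_ev):
    """Instrument response width expressed as an equivalent velocity."""
    return C_KM_S * fwhm_ev * FWHM_TO_SIGMA / line_energy_ev


def turbulent_sigma_km_s(total, thermal, instrumental):
    """Remove thermal and instrumental widths from the total in quadrature."""
    return math.sqrt(total**2 - thermal**2 - instrumental**2)


e_obs_ev = E_REST_EV / (1.0 + Z_CLUSTER)            # observed line energy
sigma_thermal = thermal_sigma_km_s(4000.0)          # kT = 4 keV
sigma_instrument = instrumental_sigma_km_s(5.0, e_obs_ev)

# Total width implied by the quoted 164 km/s turbulent broadening.
sigma_total = math.sqrt(164.0**2 + sigma_thermal**2 + sigma_instrument**2)

# Re-derive the turbulent width if the true FWHM were 5.5 eV or 4.5 eV.
sigma_turb_wide = turbulent_sigma_km_s(
    sigma_total, sigma_thermal, instrumental_sigma_km_s(5.5, e_obs_ev))
sigma_turb_narrow = turbulent_sigma_km_s(
    sigma_total, sigma_thermal, instrumental_sigma_km_s(4.5, e_obs_ev))

print(f"thermal width      : {sigma_thermal:.1f} km/s")
print(f"instrumental width : {sigma_instrument:.1f} km/s")
print(f"total width        : {sigma_total:.1f} km/s")
print(f"turbulent width for FWHM 5.5 / 4.5 eV: "
      f"{sigma_turb_wide:.1f} / {sigma_turb_narrow:.1f} km/s")
```

Because the three contributions add as squares, the smallest term has the weakest leverage on the result, which is why a 10% uncertainty in the instrumental resolution maps to only a few per cent in the turbulent width.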

The Fe XXVI Lyα complex alone (554 counts) yields a consistent
velocity broadening of 160 ± 16 km s⁻¹. A search for spatial variations
in velocity broadening using the Fe XXV Heα lines reveals that


*A list of participants and their affiliations appears at the end of the paper. 


7 JULY 2016 | VOL 535 | NATURE | 117 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 1 plot. Axes: S (counts s⁻¹ keV⁻¹) against E (observed) (keV). Labelled features recoverable from the plot: Cr XXIII, Mn XXIV, Fe I fluorescence, Fe XXIV, Fe XXV Heβ, Fe XXVI Lyα.]

Figure 1 | Full array spectrum of the core of the Perseus cluster
obtained by the Hitomi observatory. The redshift of the Perseus cluster
is z = 0.01756. The inset has a logarithmic scale, which allows the
weaker lines to be better seen. The flux S is plotted against photon
energy E.

all 1-arcmin-resolution bins have broadening of less than 200 km s⁻¹.
With just a single observation we cannot comment on how this result 
translates to the wider cluster core. 

The tightest previous constraint on the velocity dispersion of a cluster
gas was from the XMM-Newton reflection grating spectrometer,
giving¹¹ an upper limit of 235 km s⁻¹ on the X-ray coolest gas (that is,
kT < 3 keV, where k is Boltzmann's constant and T is the temperature) in
the distant luminous cluster A1835. These measurements are available
for only a few peaked clusters¹²; the angular size of Perseus and many
other bright clusters is too large to derive meaningful velocity results
from a slitless dispersive spectrometer such as the reflection grating
spectrometer (the corresponding limit for Perseus¹³ is 625 km s⁻¹). The
Hitomi SXS achieves much higher accuracy on diffuse hot gas owing
to it being non-dispersive.

We measure a slightly higher velocity broadening, 187 ± 13 km s⁻¹,
in the central region (Fig. 3a) that includes the bubbles and the
nucleus. This region exhibits a strong power-law component from
the AGN, which is several times brighter than the measurement¹⁴
made in 2001 with XMM-Newton, consistent with the luminosity
increase seen at other wavelengths. A fluorescent line from neutral
Fe is present in the spectrum (Fig. 1), which can be emitted by the
AGN or by the cold gas present in the cluster core¹⁵. The intracluster
medium has a slightly lower average temperature (3.8 ± 0.1 keV) than
the outer region (4.1 ± 0.1 keV). By fitting the lines with Gaussians,


[Figure 2, panel a: Fe XXV Heα; z = 0.01756; σ_v = 164 km s⁻¹. Axes: S (counts s⁻¹ keV⁻¹) against E (keV).]

Figure 2 | Spectra of Fe XXV Heα, Fe XXVI Lyα and Fe XXV Heβ from
the outer region. a–c, Gaussians (red curves) were fitted to lines with
energies (marked by short red lines) from laboratory measurements in
the case of He-like Fe XXV (a, c) and from theory in the case of Fe XXVI
Lyα (b; see Extended Data Table 1 for details) with the same velocity
dispersion (σ_v = 164 km s⁻¹), except for the Fe XXV Heα resonant line,




we measured a ratio of fluxes in Fe XXV Heα resonant and forbidden
lines of 2.48 ± 0.16, which is lower than the expected value in
optically thin plasma (for kT = 3.8 keV, the current APEC¹⁶ and SPEX¹⁷
plasma models give ratios of 2.8 and 2.9–3.6) and suggests the
presence of resonant scattering of photons¹⁸. On the basis of radiative
transfer simulations¹⁹ of resonant scattering in these lines, such
resonance-line suppression is in broad agreement with that expected for
the measured low line widths, providing independent indication of
the low level of turbulence. Uncertainties in the current atomic data,
as well as more complex structure along the line of sight and across
the region, complicate the interpretation of these results, which we
defer to a future study.

A velocity map (Fig. 3b) was produced from the absolute energies
of the lines in the Fe XXV Heα complex, using a subset of the data for
which such a measurement was reliable, given the limited calibration
(see Methods). We find a gradient in the line-of-sight velocities of about
150 ± 70 km s⁻¹, from southeast to northwest of the SXS field of view.
The velocity to the southeast (towards the nucleus) is 48 ± 17
(statistical) ± 50 (systematic) km s⁻¹ redshifted relative to NGC 1275
(redshift z = 0.01756) and consistent with results from Suzaku CCD
(charge-coupled device) data²⁰. Our statistical uncertainty on relative
velocities is about 30 times better than that of Suzaku, although there
is a systematic uncertainty on the absolute SXS velocities of about
50 km s⁻¹ (see Methods).
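
As a rough illustration of how absolute line energies translate into the velocities quoted above, the sketch below converts between an observed line energy, a redshift and a line-of-sight velocity relative to NGC 1275. The function names are hypothetical and the first-order energy shift is an approximation chosen here; the actual velocity map relies on the gain calibration described in Methods. The point of the sketch is scale: a 50 km s⁻¹ systematic corresponds to a redshift offset of about 0.00017, or an energy-scale error of about 1 eV at the Fe XXV Heα resonance line.

```python
C_KM_S = 299_792.458     # speed of light, km/s
Z_NGC1275 = 0.01756      # reference redshift quoted in the text
E_REST_W_EV = 6700.8     # Fe XXV Heα resonance line, rest frame


def velocity_from_redshift(z_obs, z_ref=Z_NGC1275):
    """Velocity (km/s) relative to the reference frame; positive means the
    gas recedes faster than the reference galaxy."""
    return C_KM_S * ((1.0 + z_obs) / (1.0 + z_ref) - 1.0)


def velocity_from_line_energy(e_obs_ev, e_rest_ev=E_REST_W_EV, z_ref=Z_NGC1275):
    """Velocity (km/s) implied by the observed energy of a known line."""
    z_obs = e_rest_ev / e_obs_ev - 1.0
    return velocity_from_redshift(z_obs, z_ref)


def energy_shift_ev(v_km_s, e_rest_ev=E_REST_W_EV, z_ref=Z_NGC1275):
    """First-order shift of the observed line energy caused by an extra
    recession velocity v; a receding source shifts the line to lower energy."""
    e_obs = e_rest_ev / (1.0 + z_ref)
    return -e_obs * v_km_s / C_KM_S


v_from_dz = velocity_from_redshift(Z_NGC1275 + 0.00017)
shift_50 = energy_shift_ev(50.0)
recovered = velocity_from_line_energy(E_REST_W_EV / (1.0 + Z_NGC1275) + shift_50)

print(f"dz = 0.00017 corresponds to {v_from_dz:.1f} km/s")
print(f"50 km/s corresponds to an energy shift of {shift_50:.2f} eV")
print(f"velocity recovered from the shifted line: {recovered:.1f} km/s")
```

An energy-scale accuracy of about 1 eV at 6.6 keV is therefore what sets the 50 km s⁻¹ systematic floor on absolute velocities, independent of photon statistics.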


[Figure 2, panels b and c: Fe XXVI Lyα (b) and Fe XXV Heβ (c). Horizontal axes: E (keV).]

which was allowed to have its own width. Instrumental broadening with
(blue line) and without (black line) thermal broadening are indicated in
a. The redshift (z = 0.01756) is the cluster value to which the data were
self-calibrated using the Fe XXV Heα lines. The strongest resonance ('w'),
intercombination ('x', 'y') and forbidden ('z') lines are indicated. The error
bars are 1 s.d.




Figure 3 | The region of the Perseus cluster observed by the SXS.
a, The field of view of the SXS overlaid on a Chandra image. The nucleus
of NGC 1275 is seen as the white dot with inner bubbles to the north and
south. A buoyant outer bubble lies northwest of the centre of the field.
A swirling cold front coincides with the second-most-outer contour.
The central and outer regions are marked. b, The bulk velocity field across
the imaged region. Colours show the difference from the velocity of


NGC 1275 hosts a giant (80-kpc wide) molecular nebula seen in
CO and Hα data with a total cold-gas mass of several 10¹⁰ M⊙, which
dominates the total gas mass out to a 15-kpc radius. The velocities of 
that gas²¹,²² are consistent with the trend of the SXS bulk shear,
suggesting that the molecular gas moves together with the hot plasma.
(More details of the X-ray spectra and imaged region are provided in 
Extended Data Figs 1-8.) 

The large-scale bulk shear over the observed 60-kpc field is of
comparable amplitude to the small-scale velocity dispersion that we derive
for the outer region. The dispersion can be due to gas flows around the
rising bubble at the centre of the field²³,²⁴, a velocity gradient in the
cold front²⁵ contained in this region, sound waves²⁶,²⁷, turbulence²⁸
or galaxy motions²⁹. The large-scale shear could be due to the buoyant
AGN bubbles or to sloshing motions of gas in the cluster core that give
rise to the cold front²⁵.

If the observed dispersion is interpreted as turbulence driven on 
scales comparable with the size of the largest bubbles in the field (about 
20–30 kpc), then it is in agreement with the level inferred²⁸ from X-ray
surface brightness fluctuations. In this case, our measured velocity dis- 
persion suggests that turbulent dissipation of kinetic energy can offset 
radiative cooling. However, assuming isotropic turbulence, the ratio of 
turbulent pressure to thermal pressure in the intracluster medium is 
low at 4%. Such low-velocity turbulence cannot spread far (<10 kpc) 
across the cooling core during the fraction (4%) of the cooling time 
in which it must be replenished, so the turbulent-dissipation mech- 
anism requires that turbulence be generated in situ throughout the 
core. Another process is needed to transport energy from the bubbling 
region. The observed level of turbulence is also sufficient to sustain 
the population of ultrarelativistic electrons that give rise to the radio 
synchrotron mini-halo observed in the Perseus core³⁰.
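
The 4% figure can be checked with a back-of-envelope calculation. For isotropic turbulence with one-dimensional dispersion σ, the turbulent pressure is ρσ² and the thermal pressure is ρkT/(μm_p), so their ratio is μm_pσ²/kT. The sketch below assumes a mean molecular weight μ = 0.6 and kT = 4 keV; both values, and the function name, are choices made for this illustration rather than numbers taken from the collaboration's analysis. Folding the 150 km s⁻¹ gradient into the dispersion in quadrature is likewise a crude assumption of ours, used only to show that the shear can at most roughly double the estimate.

```python
import math

C_KM_S = 299_792.458          # speed of light, km/s
PROTON_MASS_EV = 938.272e6    # proton rest energy, eV
MEAN_MOLECULAR_WEIGHT = 0.6   # assumed value for a fully ionized plasma


def turbulent_to_thermal_pressure(sigma_1d_km_s, kt_ev,
                                  mu=MEAN_MOLECULAR_WEIGHT):
    """Ratio P_turb / P_th = mu * m_p * sigma_1D^2 / kT for isotropic
    turbulence with one-dimensional velocity dispersion sigma_1D."""
    beta_squared = (sigma_1d_km_s / C_KM_S) ** 2
    return mu * PROTON_MASS_EV * beta_squared / kt_ev


# Measured line-of-sight dispersion in the outer region, kT = 4 keV.
ratio_dispersion = turbulent_to_thermal_pressure(164.0, 4000.0)

# Crude upper estimate: add the 150 km/s gradient in quadrature
# (an assumption made here, not the paper's stated procedure).
sigma_with_shear = math.hypot(164.0, 150.0)
ratio_with_shear = turbulent_to_thermal_pressure(sigma_with_shear, 4000.0)

print(f"P_turb / P_th from dispersion alone : {ratio_dispersion:.3f}")
print(f"P_turb / P_th including the shear   : {ratio_with_shear:.3f}")
```

The ratio scales as σ², so even a 50% larger dispersion would leave the turbulent pressure at around a tenth of the thermal pressure.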

A low level of turbulent pressure measured for the core region 
of a cluster, which is continuously stirred by a central AGN and 
gas sloshing, is surprising and may imply that turbulence in the 
intracluster medium is difficult to generate and/or easy to damp. If 
true throughout the cluster, then this is encouraging for total mass 
measurements, which depend on knowledge of all forms of pres- 
sure support, and for cluster cosmology, which depends on accurate 
masses. 

The Hitomi spacecraft lost its ground contact on 2016 March 26, and 
later the recovery operation by JAXA was discontinued. 


the central galaxy NGC 1275 (whose redshift is z = 0.01756); positive
difference means gas receding faster than the galaxy. The 1-arcmin pixels
of the map correspond approximately to the angular resolution, but are
not entirely independent (see Extended Data Fig. 5). The calibration
uncertainty on velocities in individual pixels and in the overall baseline is
50 km s⁻¹ (Δz = 0.00017).


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 26 April; accepted 4 June 2016. 


1. Allen, S. W., Evrard, A. E. & Mantz, A. B. Cosmological parameters from 
observations of galaxy clusters. Annu. Rev. Astron. Astrophys. 49, 409-470 
(2011). 

2. Boehringer, H., Voges, W., Fabian, A. C., Edge, A. C. & Neumann, D. M. A ROSAT 
HRI study of the interaction of the X-ray-emitting gas and radio lobes of 
NGC 1275. Mon. Not. R. Astron. Soc. 264, L25-L28 (1993). 

3. Churazov, E., Forman, W., Jones, C. & Böhringer, H. Asymmetric, arc minute
scale structures around NGC 1275. Astron. Astrophys. 356, 788-794 (2000). 

4. McNamara, B. R. et al. Chandra X-Ray observations of the Hydra A cluster: 
an interaction between the radio source and the X-Ray-emitting gas. 
Astrophys. J. 534, L135-L138 (2000). 

5. Fabian, A. C. et al. Chandra imaging of the complex X-ray core of the 
Perseus cluster. Mon. Not. R. Astron. Soc. 318, L65-L68 (2000). 

6. Fabian, A. C. Observational evidence of active galactic nuclei feedback.
Annu. Rev. Astron. Astrophys. 50, 455-489 (2012).

7. Takahashi, T. et al. The ASTRO-H X-ray astronomy satellite. Proc. SPIE 9144, 
914425 (2014). 

8. Mitsuda, K. et al. Soft x-ray spectrometer (SXS): the high-resolution cryogenic 
spectrometer onboard ASTRO-H. Proc. SPIE 9144, 91442A (2014). 

9. Soong, Y. et al. ASTRO-H soft X-ray telescope (SXT). Proc. SPIE 9144, 914428 
(2014). 

10. Tamura, T. et al. X-ray spectroscopy of the core of the Perseus cluster with 
Suzaku: elemental abundances in the intracluster medium. Astrophys. J. 705, 
L62-L66 (2009). 

11. Sanders, J. S., Fabian, A. C., Smith, R. K. & Peterson, J. R. A direct limit on the 
turbulent velocity of the intracluster medium in the core of Abell 1835 from 
XMM-Newton. Mon. Not. R. Astron. Soc. 402, L11-L15 (2010). 

12. Sanders, J. S., Fabian, A. C. & Smith, R. K. Constraints on turbulent velocity 
broadening for a sample of clusters, groups and elliptical galaxies using 
XMM-Newton. Mon. Not. R. Astron. Soc. 410, 1797-1812 (2011). 

13. Pinto, C. et al. Chemical Enrichment RGS cluster Sample (CHEERS): 
constraints on turbulence. Astron. Astrophys. 575, A38 (2015). 

14. Churazov, E., Forman, W., Jones, C. & Böhringer, H. XMM-Newton observations
of the Perseus cluster. I. The temperature and surface brightness structure.
Astrophys. J. 590, 225-237 (2003). 

15. Churazov, E., Sunyaev, R., Gilfanov, M., Forman, W. & Jones, C. The 6.4-keV
fluorescent iron line from cluster cooling flows. Mon. Not. R. Astron. Soc. 297, 
1274-1278 (1998). 

16. Foster, A. R., Li, J., Smith, R. K. & Brickhouse, N. S. Updated atomic data and 
calculations for X-Ray spectroscopy. Astrophys. J. 756, 128 (2012). 

17. Kaastra, J. S., Mewe, R. & Nieuwenhuijzen, H. SPEX: a new code for spectral 
analysis of X & UV spectra. In 11th Colloquium on UV and X-ray Spectroscopy of 
Astrophysical and Laboratory Plasmas (eds Yamashita, K. & Watanabe, T.) 
411-414 (1996). 

18. Gil’fanov, M. R., Syunyaev, R. A. & Churazov, E. M. Radial brightness profiles of 
resonance x-ray lines in galaxy clusters. Sov. Astron. Lett. 13, 3-7 (1987). 




19. Zhuravleva, I. et al. Resonant scattering in the Perseus cluster: spectral model
for constraining gas motions with Astro-H. Mon. Not. R. Astron. Soc. 435, 
3111-3121 (2013). 

20. Tamura, T. et al. Gas bulk motion in the Perseus cluster measured with Suzaku. 
Astrophys. J. 782, 38 (2014). 

21. Salomé, P. et al. A very extended molecular web around NGC 1275.
Astron. Astrophys. 531, A85 (2011).

22. Hatch, N. A., Crawford, C. S., Johnstone, R. M. & Fabian, A. C. On the origin 
and excitation of the extended nebula surrounding NGC 1275. Mon. Not. R. 
Astron. Soc. 367, 433-448 (2006). 

23. Brüggen, M., Hoeft, M. & Ruszkowski, M. X-Ray line tomography of
AGN-induced motion in clusters of galaxies. Astrophys. J. 628, 153-159 
(2005). 

24. Heinz, S., Brüggen, M. & Morsony, B. Prospects of high-resolution X-ray
spectroscopy for active galactic nucleus feedback in galaxy clusters. 
Astrophys. J. 708, 462-468 (2010). 

25. Markevitch, M. & Vikhlinin, A. Shocks and cold fronts in galaxy clusters.
Phys. Rep. 443, 1-53 (2007).

26. Fabian, A. C. et al. A deep Chandra observation of the Perseus cluster:
shocks and ripples. Mon. Not. R. Astron. Soc. 344, L43-L47 (2003). 

27. Ruszkowski, M., Brüggen, M. & Begelman, M. C. Cluster heating
by viscous dissipation of sound waves. Astrophys. J. 611, 158-163 (2004). 

28. Zhuravleva, I. et al. Turbulent heating in galaxy clusters brightest in X-rays.
Nature 515, 85-87 (2014). 

29. Gu, L. et al. Probing of the interactions between the hot plasmas and galaxies 
in clusters from z = 0.1 to 0.9. Astrophys. J. 767, 157 (2013). 

30. ZuHone, J. A., Markevitch, M., Brunetti, G. & Giacintucci, S. Turbulence and 
radio mini-halos in the sloshing cores of galaxy clusters. Astrophys. J. 762, 
78 (2013). 


Acknowledgements We acknowledge all the JAXA members who have
contributed to the ASTRO-H (Hitomi) project. All US members gratefully
acknowledge support through the NASA Science Mission Directorate.
Stanford and SLAC members acknowledge support via DoE contract to
SLAC National Accelerator Laboratory DE-AC3-76SF00515 and NASA grant
NNX15AM19G. Part of this work was performed under the auspices of the
US DoE by LLNL under contract DE-AC52-07NA27344 and also supported by
NASA grants to LLNL. Support from the European Space Agency is gratefully
acknowledged. French members acknowledge support from CNES, the Centre
National d’Etudes Spatiales. SRON is supported by NWO, the Netherlands
Organization for Scientific Research. The Swiss team acknowledges support of
the Swiss Secretariat for Education, Research and Innovation SERI and ESA’s
PRODEX programme. The Canadian Space Agency is acknowledged for the
support of Canadian members. We acknowledge support from JSPS/MEXT
KAKENHI grant numbers 15H02070, 15K05107, 23340071, 26109506,
24103002, 25400236, 25800119, 25400237, 25287042, 24540229,
25105516, 23540280, 25400235, 25247028, 26800095, 25400231,
25247028, 26220703, 24105007, 23340055, 15H00773, 23000004,
15H02090, 15K17610, 15H05438, 15H00785 and 24540232. H. Akamatsu
acknowledges support of NWO via a Veni grant. M. Axelsson acknowledges a
JSPS International Research Fellowship. C. Done acknowledges STFC funding
under grant ST/L00075X/1. P. Gandhi acknowledges a JAXA International Top
Young Fellowship and UK Science and Technology Funding Council (STFC)
grant ST/J003697/2. H. Russell, A. C. Fabian and C. Pinto acknowledge support
from ERC Advanced Grant Feedback 340442. We thank contributions by
many companies, including, in particular, NEC, Mitsubishi Heavy Industries,
Sumitomo Heavy Industries and Japan Aviation Electronics Industry. Finally,
we acknowledge strong support from the following engineers. JAXA/ISAS:
C. Baluta, N. Bando, A. Harayama, K. Hirose, K. Ishimura, N. Iwata, T. Kawano,
S. Kawasaki, K. Minesugi, C. Natsukari, H. Ogawa, M. Ogawa, M. Ohta, T. Okazaki,
S.-i. Sakai, Y. Shibano, M. Shida, T. Shimada, A. Wada, T. Yamada; JAXA/TKSC:
A. Okamoto, Y. Sato, K. Shinozaki, H. Sugita; Chubu U: Y. Namba; Ehime U:
K. Ogi; Kochi U of Technology: T. Kosaka; Miyazaki U: Y. Nishioka; Nagoya U:
H. Nagano; NASA/GSFC: T. Bialas, K. Boyce, E. Canavan, M. DiPirro, M. Kimball,
C. Masters, D. Mcguinness, J. Miko, T. Muench, J. Pontius, P. Shirron, C.
Simmons, G. Sneiderman, T. Watanabe; Noqsi Aerospace Ltd: J. Doty; Stanford
U/KIPAC: M. Asai, K. Gilmore; ESA (Netherlands): C. Jewell; SRON: D. Haas,
M. Frericks, P. Laubert, P. Lowes; U of Geneva: P. Azzarello; CSA: A. Koujelev,
F. Moroso.


Author Contributions The science goals of Hitomi (known as ASTRO-H before 
launch) were discussed and developed over more than 10 years by the 
ASTRO-H Science Working Group (SWG), all members of which are authors of 
this manuscript. All the instruments were prepared by joint efforts of the team. 
Calibration of the Perseus dataset was carried out by members of the SXS team. 
Data analysis and manuscript preparation were carried out by a small subgroup 
of authors appointed by the SWG, on the basis of the extensive discussion made 
in the white paper produced by all SWG members. The manuscript was subject 
to an internal collaboration-wide review process. All authors reviewed and 
approved the final version of the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
A.C. Fabian (acf@ast.cam.ac.uk). 




Hitomi Collaboration 

Felix Aharonian¹,², Hiroki Akamatsu³, Fumie Akimoto⁴, Steven W. Allen⁵,⁶,⁷,
Naohisa Anabuki®, Lorella Angelini?, Keith Arnaud?1°, Marc Audard?}, 
Hisamitsu Awaki!?, Magnus Axelsson!3, Aya Bamba", Marshall Bautz!>, 
Roger Blandford®®7, Laura Brenneman!®, Gregory V. Brown!?’, 

Esra Bulbul!5, Edward Cackett!®, Maria Chernyakova!, Meng Chiao’, 

Paolo Coppi?’, Elisa Costantini?, Jelle de Plaa?, Jan-Willem den Herder, 
Chris Done?°, Tadayasu Dotani?!, Ken Ebisawa2!, Megan Eckart?, 

Teruaki Enoto2223, Yuichiro Ezoe!8, Andrew C. Fabian!8, Carlo Ferrigno!!, 
Adam Foster?®, Ryuichi Fujimoto?*, Yasushi Fukazawa?°, Akihiro Furuzawa’, 
Massimiliano Galeazzi?®, Luigi Gallo2’, Poshak Gandhi28, Margherita Giustini3, 
Andrea Goldwurm?%, Liyi Gu’, Matteo Guainazzi*!3°, Yoshito Haba?!, 

Kouichi Hagino?!, Kenji Hamaguchi?*?, Ilana Harrus?*, Isamu Hatsukade®s, 
Katsuhiro Hayashi2!, Takayuki Hayashi*, Kiyoshi Hayashida®, Junko Hiraga*4, 
Ann Hornschemeier®, Akio Hoshino®®, John Hughes*®, Ryo Iizuka2!,

Hajime Inoue?!, Yoshiyuki Inoue!, Kazunori Ishibashi*, Manabu Ishida??, 
Kumi Ishikawa3’, Yoshitaka Ishisaki!3, Masayuki Itoh?8, Naoko Iyomoto?9,
Jelle Kaastra?, Timothy Kallman®, Tuneyoshi Kamae®, Erin Kara?®, 

Jun Kataoka*®, Satoru Katsuda*!, Junichiro Katsuta2°, Madoka Kawaharada‘*?, 
Nobuyuki Kawai‘, Richard Kelley?, Dmitry Khangulyan®9, Caroline Kilbourne,
Ashley King®®, Takao Kitaguchi®°, Shunji Kitamoto*°, Tetsu Kitayama“, 
Takayoshi Kohmura*®, Motohide Kokubun?!, Shu Koyama?!, 

Katsuji Koyama*®, Peter Kretschmar®°, Hans Krimm?47, Aya Kubota*8, 
Hideyo Kunieda’, Philippe Laurent?2, Francois Lebrun?2, Shiu-Hang Lee?!, 
Maurice Leutenegger®, Olivier Limousin?2, Michael Loewenstein??°, 

Knox S. Long*?, David Lumb®°, Grzegorz Madejski®’, Yoshitomo Maeda?!, 
Daniel Maier?2, Kazuo Makishima®!, Maxim Markevitch®, 

Hironori Matsumoto, Kyoko Matsushita®?, Dan McCammon*, 

Brian McNamara®®, Missagh Mehdipour®, Eric Miller!5, Jon Miller°®, 

Shin Mineshige??, Kazuhisa Mitsuda??, Ikuyuki Mitsuishi*, Takuya Miyazawa’, 
Tsunefumi Mizuno®, Hideyuki Mori?, Koji Mori??, Harvey Moseley, 

Koji Mukai?:32, Hiroshi Murakami®’, Toshio Murakami2‘, Richard Mushotzky!®, 
Ryo Nagino®, Takao Nakagawa?!, Hiroshi Nakajima®, Takeshi Nakamori®®, 
Toshio Nakano?’, Shinya Nakashima?!, Kazuhiro Nakazawa", 

Masayoshi Nobukawa®®, Hirofumi Noda’, Masaharu Nomachi®, 

Steve O’Dell®!, Hirokazu Odaka?!, Takaya Ohashi!?, Masanori Ohno?°, 
Takashi Okajima’, Naomi Ota®?, Masanobu Ozaki?!, Frits Paerels®°, 
Stephane Paltani!!, Arvind Parmar®°, Robert Petre®, Ciro Pinto!8, 

Martin Pohl!?, F. Scott Porter®, Katja Pottschmidt??, Brian Ramsey®!, 
Christopher Reynolds?°, Helen Russell!8, Samar Safi-Harb®*, Shinya Saito, 
Kazuhiro Sakai?, Hiroaki Sameshima!, Goro Sato2!, Kosuke Sato°3, 

Rie Sato2!, Makoto Sawada®®, Norbert Schartel®°, Peter Serlemitsos?, 

Hiromi Seta!?, Megumi Shidatsu>!, Aurora Simionescu!, Randall Smith?®, 
Yang Soong?, Lukasz Stawarz®®, Yasuharu Sugawara*!, Satoshi Sugita‘, 
Andrew Szymkowiak!9, Hiroyasu Tajima®’, Hiromitsu Takahashi2°, 

Tadayuki Takahashi2!, Shin’ichiro Takeda®®, Yoh Takei?!, Toru Tamagawa?’, 
Keisuke Tamura’, Takayuki Tamura2!, Takaaki Tanaka*®, Yasuo Tanaka?!, 
Yasuyuki Tanaka?°, Makoto Tashiro®, Yuzuru Tawara’, Yukikatsu Terada®?, 
Yuichi Terashima??, Francesco Tombesi?, Hiroshi Tomida?!, Yohko Tsuboi*!, 
Masahiro Tsujimoto2!, Hiroshi Tsunemi®, Takeshi Tsuru*®, Hiroyuki Uchida*®, 
Hideki Uchiyama’®, Yasunobu Uchiyama*®, Shutaro Ueda?!, Yoshihiro Ueda??, 
Shiro Ueno*!, Shin’ichiro Uno”!, Meg Urry!9, Eugenio Ursino?®, Cor de 
Vries’, Shin Watanabe2!, Norbert Werner®®, Daniel Wik?:72, Dan Wilkins2’, 
Brian Williams?, Shinya Yamada!3, Hiroya Yamaguchi®, Kazutaka Yamaoka’, 
Noriko Y. Yamasaki?!, Makoto Yamauchi*?, Shigeo Yamauchi®?, 

Tahir Yaqoob?2, Yoichi Yatsu*?, Daisuke Yonetoku2‘, Atsumasa Yoshida®®, 
Takayuki Yuasa®’, Irina Zhuravleva>® & Abderahmen Zoghbi°®


¹Astronomy and Astrophysics Section, Dublin Institute for Advanced Studies, Dublin 2,
Ireland. ²National Research Nuclear University (MEPHI), 115409 Moscow, Russia.
³SRON Netherlands Institute for Space Research, Utrecht, The Netherlands. ⁴Department of
Physics, Nagoya University, Aichi 464-8602, Japan. ⁵Kavli Institute for Particle Astrophysics
and Cosmology, Stanford University, Stanford, California 94305, USA. ⁶Department of
Physics, Stanford University, 382 Via Pueblo Mall, Stanford, California 94305, USA.
⁷SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025,
USA. ⁸Department of Earth and Space Science, Osaka University, Osaka 560-0043, Japan.
⁹NASA/Goddard Space Flight Center, Greenbelt, Maryland 20771, USA. ¹⁰Department of
Astronomy, University of Maryland, College Park, Maryland 20742, USA. ¹¹Université de
Genève, 1211 Genève 4, Switzerland. ¹²Department of Physics, Ehime University, Ehime
790-8577, Japan. ¹³Department of Physics, Tokyo Metropolitan University, Tokyo 192-0397,
Japan. ¹⁴Department of Physics, University of Tokyo, Tokyo 113-0033, Japan. ¹⁵Kavli Institute
for Astrophysics and Space Research, Massachusetts Institute of Technology, Cambridge,
Massachusetts 02139, USA. ¹⁶Smithsonian Astrophysical Observatory, 60 Garden Street,
MS-4, Cambridge, Massachusetts 02138, USA. ¹⁷Lawrence Livermore National Laboratory,
Livermore, California 94550, USA. ¹⁸Institute of Astronomy, Cambridge University, Cambridge
CB3 0HA, UK. ¹⁹Yale Center for Astronomy and Astrophysics, Yale University, New Haven,
Connecticut 06520-8121, USA. ²⁰Department of Physics, University of Durham, Durham
DH1 3LE, UK. ²¹Institute of Space and Astronautical Science (ISAS), Japan Aerospace
Exploration Agency (JAXA), Kanagawa 252-5210, Japan. ²²Department of Astronomy, Kyoto
University, Kyoto 606-8502, Japan. ²³The Hakubi Center for Advanced Research, Kyoto
University, Kyoto 606-8302, Japan. ²⁴Faculty of Mathematics and Physics, Kanazawa
University, Ishikawa 920-1192, Japan. ²⁵Department of Physical Science, Hiroshima
University, Hiroshima 739-8526, Japan. ²⁶Physics Department, University of Miami, Miami,
Florida 33124, USA. ²⁷Department of Astronomy and Physics, Saint Mary's University,
Halifax, Nova Scotia B3H 3C3, Canada. ²⁸Department of Physics and Astronomy, University
of Southampton, Highfield, Southampton SO17 1BJ, UK. ²⁹IRFU/Service d’Astrophysique,
CEA Saclay, 91191 Gif-sur-Yvette Cedex, France. ³⁰European Space Agency (ESA), European
Space Astronomy Centre (ESAC), Madrid, Spain. ³¹Department of Physics and Astronomy,
Aichi University of Education, Aichi 448-8543, Japan. ³²Department of Physics, University
of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, Maryland 21250, USA.
³³Department of Applied Physics and Electronic Engineering, University of Miyazaki,
Miyazaki 889-2192, Japan. ³⁴Department of Physics, School of Science and Technology,
Kwansei Gakuin University, Hyogo 669-1337, Japan. ³⁵Department of Physics, Rikkyo
University, Tokyo 171-8501, Japan. ³⁶Department of Physics and Astronomy, Rutgers
University, Piscataway, New Jersey 08854-8019, USA. ³⁷RIKEN Nishina Center, Saitama
351-0198, Japan. ³⁸Faculty of Human Development, Kobe University, Hyogo 657-8501,
Japan. ³⁹Kyushu University, Fukuoka 819-0395, Japan. ⁴⁰Research Institute for Science
and Engineering, Waseda University, Tokyo 169-8555, Japan. ⁴¹Department of Physics,
Chuo University, Tokyo 112-8551, Japan. ⁴²Tsukuba Space Center (TKSC), Japan Aerospace
Exploration Agency (JAXA), Ibaraki 305-8505, Japan. ⁴³Department of Physics, Tokyo Institute
of Technology, Tokyo 152-8551, Japan. ⁴⁴Department of Physics, Toho University, Chiba
274-8510, Japan. ⁴⁵Department of Physics, Tokyo University of Science, Chiba 278-8510,
Japan. ⁴⁶Department of Physics, Kyoto University, Kyoto 606-8502, Japan. ⁴⁷Universities
Space Research Association, 7178 Columbia Gateway Drive, Columbia, Maryland 21046, USA.
⁴⁸Department of Electronic Information Systems, Shibaura Institute of Technology, Saitama
337-8570, Japan. ⁴⁹Space Telescope Science Institute, Baltimore, Maryland 21218, USA.
⁵⁰European Space Agency (ESA), European Space Research and Technology Centre (ESTEC),
2200 AG Noordwijk, The Netherlands. ⁵¹RIKEN, Saitama 351-0198, Japan. ⁵²Kobayashi-
Maskawa Institute, Nagoya University, Aichi 464-8602, Japan. ⁵³Department of Physics,
Tokyo University of Science, Tokyo 162-8601, Japan. ⁵⁴Department of Physics, University of
Wisconsin, Madison, Wisconsin 53706, USA. ⁵⁵University of Waterloo, Waterloo, Ontario
N2L 3G1, Canada. ⁵⁶Department of Astronomy, University of Michigan, Ann Arbor, Michigan
48109, USA. ⁵⁷Department of Information Science, Faculty of Liberal Arts, Tohoku Gakuin
University, Miyagi 981-3193, Japan. ⁵⁸Department of Physics, Faculty of Science, Yamagata
University, Yamagata 990-8560, Japan. ⁵⁹Department of Teacher Training and School
Education, Nara University of Education, Takabatake-cho, Nara 630-8528, Japan. ⁶⁰Research
Center for Nuclear Physics (Toyonaka), Osaka University, 1-1 Machikaneyama-machi, Toyonaka,
Osaka 560-0043, Japan. ⁶¹NASA/Marshall Space Flight Center, Huntsville, Alabama 35812,
USA. ⁶²Department of Physics, Faculty of Science, Nara Women’s University, Nara 630-8506,
Japan. ⁶³Department of Astronomy, Columbia University, New York, New York 10027, USA.
⁶⁴Department of Physics and Astronomy, University of Manitoba, Winnipeg, Manitoba R3T 2N2,
Canada. ⁶⁵Department of Physics and Mathematics, Aoyama Gakuin University, Kanagawa
252-5258, Japan. ⁶⁶Astronomical Observatory, Jagiellonian University, 30-244 Kraków, Poland.
⁶⁷Institute of Space-Earth Environmental Research, Nagoya University, Aichi 464-8601, Japan.
⁶⁸Advanced Medical Instrumentation Unit, Okinawa Institute of Science and Technology
Graduate University (OIST), Okinawa 904-0495, Japan. ⁶⁹Department of Physics, Saitama
University, Saitama 338-8570, Japan. ⁷⁰Science Education, Faculty of Education, Shizuoka
University, Shizuoka 422-8529, Japan. ⁷¹Faculty of Health Sciences, Nihon Fukushi University,
Aichi 475-0012, Japan. ⁷²Department of Physics and Astronomy, Johns Hopkins University,
Baltimore, Maryland 21218, USA.


7 JULY 2016 | VOL 535 | NATURE | 121 


© 2016 Macmillan Publishers Limited. All rights reserved 




METHODS 


Gain corrections and calibration. Gain scales for each pixel were measured in 
ground calibration using a series of fiducial X-ray lines at several detector heat- 
sink temperatures (a single spectral energy reference is sufficient to determine 
the effective detector temperature and thus the appropriate gain curve to use). 
As the heat-sink temperature varies, the gain of each pixel tracks the gain change in 
the separate calibration pixel that is continuously illuminated by a dedicated ⁵⁵Fe
source. However, time-varying differential thermal loading of the pixels changes 
their gains by different factors. Thus, use of the gain history of the calibration pixel 
alone can be insufficient to correct the gain scale of the main array. 

The Perseus observation used for this work was performed in two parts, 7 days 
apart, during which the gain of the calibration pixel changed by 0.6%. Ten days after 
the final observation, a fiducial measurement for the full array was obtained with 
an on-board ⁵⁵Fe source mounted on a filter wheel. To relate this calibration to the
two Perseus observations, a two-stage approach was used. First, a correction factor 
was applied to all pixels using the gain history of the calibration pixel. Second, 
the differential pixel-pixel gain error was removed using the science observation 
itself. To do this, the two Perseus observations were subdivided, and the He-like 
Fe complex was fitted for each pixel in each subset. The time-dependent relative 
gain of each pixel (compared to the gain correction of the calibration pixel) was 
then linearly fitted and extrapolated to the later full-array calibration. The full 
dataset was then corrected using this time-dependent gain function, and the fitting 
errors were incorporated into the error analysis. To validate this approach, we 
compared the first observation, which required a substantial gain correction, to 
the second, for which the instrument was much closer to thermal equilibrium and 
thus required much less correction. In the first case, the bulk velocity uncertainties 
are dominated by the uncertainties in the gain correction, whereas, in the second, 
the uncertainties are dominated by the fit to the He-like Fe complex. The results 
for the two datasets agree for both bulk velocity and velocity dispersion, indicating 
that this is a robust approach. For the absolute velocity maps, we are presenting 
only the result from the second observation of the two used in this work, which 
requires the least correction and thus has the smallest uncertainty. Note that the 
limited gain calibration results in pixel-to-pixel uncertainty of 50 km s⁻¹ on the
absolute velocities. 
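As a rough illustration of the second stage described above, the linear fit and extrapolation of one pixel's relative gain could be sketched as follows. The sample times, the gain values and the calibration epoch are hypothetical numbers chosen only for illustration, and the plain least-squares routine is an assumption rather than the fitting procedure actually used.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values ~ slope * t + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept

# Hypothetical relative gains of one pixel (pixel gain divided by the
# calibration-pixel correction) in four subsets of the two observations.
times_days = [0.0, 0.5, 7.0, 7.5]
relative_gain = [1.0010, 1.0009, 1.0003, 1.0002]

slope, intercept = fit_line(times_days, relative_gain)

# Hypothetical epoch of the later full-array calibration (ten days after
# the final subset), to which the fitted trend is extrapolated.
t_calibration = 17.5
gain_at_calibration = slope * t_calibration + intercept
print(f"slope = {slope:.3e} per day, extrapolated gain = {gain_at_calibration:.5f}")
```

In this sketch the extrapolated value ties the in-flight trend of the pixel to the epoch of the full-array measurement; the uncertainty of the fit would still have to be propagated, as the text notes.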

To derive the absolute velocities of the cluster, we applied the heliocentric correction,
which was −26.4 km s⁻¹ for the observation used for velocity mapping.
The orbital motion of the satellite around Earth averages out. Our velocities are 
compared to the heliocentric velocity of NGC 1275 in Fig. 3 and Extended Data 
Fig. 6. 

An additional validation of our calibration comes from a weak background 
line in the whole-array spectrum from stray ⁵⁵Fe X-rays, which, after the above
procedure, is observed at the correct energy to ±1.8 eV (equivalent to ±90 km s⁻¹).
Although the line is not strong enough to verify the calibration of individual pixels 
(because there should be about 68 counts in this line, non-uniformly distributed 
across the array), it is a convincing check of the approach. 
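The equivalence between an energy offset and a velocity quoted above can be checked with the non-relativistic Doppler relation v = c ΔE/E. The sketch below assumes that the background line sits near 5.9 keV (the Mn Kα energy expected from an ⁵⁵Fe source); that line energy is an assumption made here for illustration, not a value stated in the text.

```python
C_KM_S = 299_792.458  # speed of light in km/s

def energy_shift_to_velocity(delta_e_ev, line_energy_ev):
    """Non-relativistic Doppler velocity (km/s) equivalent to an energy shift."""
    return C_KM_S * delta_e_ev / line_energy_ev

# Assumed line energy (eV) of the stray calibration line; illustrative only.
ASSUMED_LINE_ENERGY_EV = 5895.0

velocity_kms = energy_shift_to_velocity(1.8, ASSUMED_LINE_ENERGY_EV)
print(f"1.8 eV at {ASSUMED_LINE_ENERGY_EV:.0f} eV corresponds to {velocity_kms:.0f} km/s")
```

Under this assumption the result is close to the 90 km s⁻¹ quoted in the text.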

To determine velocity dispersion, we applied additional scale factors for each 
SXS pixel to match the apparent energies of the cluster Fe Heα complex in order
to remove any residual gain errors at the relevant energy. This also removes the 
effect of true bulk shear. Pixels were then combined in physically relevant regions 
to minimize statistical uncertainties. 

We presumed a fixed energy resolution of 5.0-eV FWHM in all the analyses. 
By comparing the line widths in the first and second parts of the observation to 
estimate the broadening from residual gain drift, and accounting for the variation 
in resolution of the calibration pixel in time over the observation and during the 
later calibration of the array, we estimate that the composite resolution of the array 
and of the separately analysed central and outer regions is bounded with high con- 
fidence between 4.5-eV and 5.5-eV FWHM. This 10% uncertainty in instrumental 
broadening produces a much smaller fractional uncertainty in velocity broadening 
because the instrumental broadening is roughly half as large as the astronomical 
broadening, and adds in quadrature with it. 
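The quadrature argument can be made concrete with a short calculation. The sketch below assumes an astronomical broadening of 160 km s⁻¹ (an illustrative round number, not a fitted result) and uses the Fe Heα resonance energy of 6,700.76 eV from Extended Data Table 1 to convert the instrumental FWHM into a velocity width.

```python
import math

C_KM_S = 299_792.458                                   # speed of light, km/s
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # Gaussian FWHM / sigma
LINE_ENERGY_EV = 6700.76                               # Fe XXV He-alpha resonance line
ASSUMED_ASTRO_SIGMA = 160.0                            # km/s, illustrative assumption

def instrumental_sigma_kms(fwhm_ev):
    """Gaussian sigma of the instrument response expressed as a velocity width."""
    sigma_ev = fwhm_ev / FWHM_PER_SIGMA
    return C_KM_S * sigma_ev / LINE_ENERGY_EV

def inferred_astro_sigma(total_sigma, fwhm_ev):
    """Astronomical width left after removing the instrument in quadrature."""
    return math.sqrt(total_sigma ** 2 - instrumental_sigma_kms(fwhm_ev) ** 2)

# Total observed width if the true resolution is 5.0 eV FWHM.
total = math.hypot(ASSUMED_ASTRO_SIGMA, instrumental_sigma_kms(5.0))

# Widths inferred if the resolution were actually 5.5 eV or 4.5 eV.
low = inferred_astro_sigma(total, 5.5)
high = inferred_astro_sigma(total, 4.5)
print(f"instrumental sigma at 5.0 eV: {instrumental_sigma_kms(5.0):.0f} km/s")
print(f"inferred broadening range: {low:.0f} to {high:.0f} km/s")
```

With these assumptions, a 10% change in the instrumental width moves the inferred velocity broadening by only a few per cent, which is the behaviour described in the text.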

The error from gain aligning the different pixels in a region is smaller than the
uncertainty in instrumental broadening because of the small statistical errors in
the determination of the scale factor at the Fe Heα complex (in an outer pixel,
equivalent to 30 km s⁻¹ at 90% confidence). Adding the spectra of multiple pixels
with the same velocity uncertainty will add 30 km s⁻¹ of noise in quadrature with
the measured broadening, producing an overestimate by no more than 3 km s⁻¹.
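The size of this overestimate follows directly from adding the two widths in quadrature. The sketch below assumes a true broadening of 160 km s⁻¹, which is an illustrative value rather than a measured one.

```python
import math

ASSUMED_TRUE_SIGMA = 160.0  # km/s, illustrative assumption
ALIGNMENT_NOISE = 30.0      # km/s, per-pixel gain-alignment uncertainty from the text

# Width measured after the alignment noise is added in quadrature.
measured_sigma = math.hypot(ASSUMED_TRUE_SIGMA, ALIGNMENT_NOISE)
overestimate = measured_sigma - ASSUMED_TRUE_SIGMA
print(f"overestimate = {overestimate:.1f} km/s")
```

For broadening of this order the excess stays below 3 km s⁻¹; it would be larger if the true broadening were much smaller.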

Our velocity dispersion measurements exclude velocity variations across the
field on scales of 20 kpc and greater because of the aforementioned self-calibration
procedure, but integrate over all scales along the line of sight (weighted by X-ray 
emissivity, which essentially limits integration to the cluster core). Any comparison 
with simulations will have to take these into account. 
Effects of angular resolution. The point spread function (PSF) of the telescope 
has a 1.2’ half-power diameter as measured during ground calibration. This means 
that regions used for spectral extraction get photons not only from the corre- 
sponding cluster regions in the sky, but also from the surrounding regions. The 
PSF image is shown in the right panel of Extended Data Fig. 5, centred on the 
SXS pixel that contains the cluster peak. By comparing the PSF with the middle 
panel of Extended Data Fig. 5, which shows the image in the Fe Heα line (which
comes mostly from the gas, as opposed to the central AGN), we see that the diffuse 
emission of the cluster is resolved. However, small regions in the detector, such as 
the 1′ × 1′ regions of the velocity map shown in Fig. 3b and Extended Data Fig. 6,
are significantly correlated. The fraction of the emission that originates in a given
1′ cluster region and ends up in the corresponding 1′ detector region is 36%-37%,
with the rest spreading over the surrounding regions. For example, for the region
marked ‘−60’ in Extended Data Fig. 6, the scattered contribution from the neighbouring
region marked ‘78’ is 23% of the flux that originates in region −60 itself;
the contribution from −60 into 78 is a similar 22% of the flux that originates and
stays in 78. Regions adjacent to the brightness peak (which is in the region marked
‘48’) are most affected: the region marked ‘94’ has a ratio of photons scattered in
from 48 to its own photons of 27%. This means that the true line-of-sight velocity
gradients on a 1′ scale have to be steeper than what we measure, but not by much.
Scattered flux from an adjacent region with a large velocity difference (for example, 
from region 78 to region —60) should contribute lines at a different velocity in the 
spectrum, but such contributions would be very small compared to the observed 
line-of-sight velocity dispersion of >160 km s⁻¹. Correction of the PSF effects is
left for future work. 

The PSF scattering also has the subtle effect of inflating our measured value
of velocity dispersion. Although the self-calibration procedure that aligns the Fe
Heα lines in each pixel (as described above) removes most of the velocity-gradient
contribution from the measured velocity dispersion, it does so after the PSF scatter- 
ing has occurred and mixed the photons from regions with different line-of-sight 
velocities, so that contribution remains. 
Pointing. For this early observation, accurate pointing direction of the spacecraft 
was not available. We therefore assumed that the observed brightness peak in the 
SXS image is the AGN in NGC 1275. The resulting uncertainty of the sky coor- 
dinates should be less than 15”. The peak of the source determined in short time 
intervals revealed a small drift of the source in the detector image, within the above 
coordinate uncertainty. It causes image smearing that is insignificant compared to 
the effect of PSF scattering. 


31. Grevesse, N. & Sauval, A. J. Standard solar composition. Space Sci. Rev. 85, 
161-174 (1998). 

32. Fabian, A. C. et al. A wide Chandra view of the core of the Perseus cluster. 
Mon. Not. R. Astron. Soc. 418, 2154-2164 (2011). 

33. Ferruit, P., Binette, L., Sutherland, R. S. & Pencontal, E. Tiger observations of the
low and high velocity components of NGC 1275. New Astron. 2, 345-363
(1997).

34. Conselice, C., Gallagher, J. G. & Wyse, R. G. On the nature of the NGC 1275
system. Astron. J. 122, 2281-2300 (2001).

35. Beiersdorfer, P. et al. High-resolution measurements, line identification,
and spectral modeling of Kα transitions in Fe XVIII-Fe XXV. Astrophys. J. 409,
846-859 (1993).

36. Smith, A. J. et al. Kβ spectra of heliumlike iron from tokamak-fusion-test-reactor
plasmas. Phys. Rev. A 47, 3073-3079 (1993).

37. Johnson, W. R. & Soff, G. The Lamb shift in hydrogen-like atoms, 1 ≤ Z ≤ 110.
Atom. Data Nucl. Data 33, 405-446 (1985).




[Figure: SXS and CCD spectra; axes S (counts s⁻¹ keV⁻¹) versus E (observed) (keV)]

Extended Data Figure 1 | SXS spectrum of the full field overlaid with a CCD spectrum of the same region. The CCD is the Suzaku X-ray imaging
spectrometer (XIS) (red line); the difference in the continuum slope is due to differences in the effective areas of the instruments.




[Figure: panels a (Fe XXV Heα), b (Fe XXVI Lyα) and c (Fe XXV Heβ); axes S (counts s⁻¹ keV⁻¹) versus E (keV); model curves APEC and SPEX]

Extended Data Figure 2 | The iron line complexes from the outer region
compared with best-fit models. a-c, These have been obtained from
various emission-line databases typically used in the literature. The spectra
were modelled as a single-temperature, optically thin plasma in collisional
ionization equilibrium using either APEC/ATOMDB 3.0.3 (ref. 16; red) or
SPEX 3.0 (ref. 17; blue). We determined the best-fit model by fitting the
Hitomi spectrum from the outer 23 pixels in the energy range 6.4-8 keV,
excluding the Fe Heα resonance line and Ni Heα line complex. We obtain
consistent best-fit parameters, with both APEC and SPEX predicting
a temperature of 4.1 ± 0.1 keV. The iron-to-hydrogen abundances are
0.62 ± 0.02 from APEC and 0.74 ± 0.02 from SPEX, relative to solar
values31. The line broadening obtained from APEC, 146 ± 7 km s⁻¹, is
smaller than the best-fit SPEX value of 171 ± 7 km s⁻¹, although both
values are consistent with the line broadening obtained by fitting a set of
Gaussians (the result presented in the main body of the paper). Apart from
the Fe Heα w line affected by resonance scattering (a), both emission line
models presented here currently have difficulty reproducing the measured
Fe Heα intercombination lines (a) as well as the exact position of the
Fe Heβ line (c). This motivates the model-independent approach adopted
in the manuscript for determining the line widths. Error bars are 1 s.d.




[Figure: Fe XXV Heα complex; axes S (counts s⁻¹ keV⁻¹) versus E (keV); model curves APEC and SPEX]

Extended Data Figure 3 | The Fe He-α line complex from the central
region around the AGN. The 5.0-8.5-keV spectrum was modelled with
an isothermal, optically thin plasma in collisional ionization equilibrium
using either APEC/ATOMDB 3.0.3 (red) or SPEX 3.0 (blue), with an
additional power-law component accounting for emission from the central
AGN. During the fit we excluded the Fe Heα resonance line because this
can be affected by resonant scattering of photons by the intracluster gas in
the line of sight. The two spectral codes provide similar results with
an average temperature of 3.8 ± 0.1 keV and metallicity consistent with
the solar value. We obtain a velocity broadening of 156 ± 12 km s⁻¹
from APEC and 178 ± 9 km s⁻¹ from SPEX. Both models suggest that
the resonant line has been suppressed in the central region. Error bars
are 1 s.d.




[Figure: confidence contours in the plane of redshift z and σv (km s⁻¹) for the Fe XXV Heα, Fe XXVI Lyα and Fe XXV Heβ complexes]

Extended Data Figure 4 | Confidence contours for joint fits of redshift z and velocity broadening σv are compared. The three line complexes have
been fitted independently. The contours are plotted at χ²min + 2.3 (68%, two parameters) and χ²min + 6.17 (95%). The three fits give consistent redshifts
(with the one to which the data were self-calibrated) and broadening.
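The two contour levels quoted in the caption correspond to standard joint confidence levels for two fitted parameters. For two degrees of freedom the χ² cumulative distribution has the closed form 1 − exp(−Δχ²/2), which the short sketch below evaluates; the function name is chosen here for illustration.

```python
import math

def coverage_two_parameters(delta_chi2):
    """Probability enclosed by a delta-chi-square contour for two jointly fitted parameters."""
    return 1.0 - math.exp(-delta_chi2 / 2.0)

for level in (2.3, 6.17):
    print(f"delta chi2 = {level}: coverage = {coverage_two_parameters(level):.3f}")
```

The two levels come out at roughly 68% and 95%, matching the percentages given in the caption.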




[Figure: three panels titled 'broad band', 'Fe Heα line' and 'point source']

Extended Data Figure 5 | The spatial response of the SXS array.
The total broadband counts (colour scale) seen across the detector array
(left), Fe Heα line counts (centre) that come mostly from the diffuse
cluster plasma, and a model response of a point source centred in the pixel
coincident with the nucleus of NGC 1275 (right) are compared. Brightness
is normalized to the same peak value.




Extended Data Figure 6 | The line-of-sight gas velocities overlaid on a
deep Chandra image. The Chandra image is from ref. 32. The contours
increase by a factor of 1.5. The numbers in the larger font indicate the
velocity in each region (see also the colour scale). The 90% errors in the
figure (numbers in the smaller font) are statistical only; our estimate of
the calibration uncertainty in individual pixels is 50 km s⁻¹. Heliocentric
correction has been applied. Velocities are shown relative to that of
NGC 1275, whose redshift is z = 0.01756 (ref. 33).




Extended Data Figure 7 | The SXS field overlaid on the cold gas
nebulosity surrounding NGC 1275. The image shows Hα emission34.
The radial velocity along the long northern filament measured from
CO data21 decreases, south to north (within the SXS field of view), from
about +50 km s⁻¹ to −65 km s⁻¹. This is similar to the trend seen in the
SXS velocity map (Extended Data Fig. 6).




[Figure: a, spectrum versus E (keV), annotated 'natural line shape' and 'FWHM = 4.94 eV'; b, histogram versus resolution (FWHM) (eV)]

Extended Data Figure 8 | In-flight spectral resolution of the SXS.
a, The composite spectrum of all pixels (excluding the calibration pixel)
when they were exposed to the ⁵⁵Fe source on the filter wheel. The blue
line shows the expected natural line shape and the red line shows the
observed profile (error bars are 1 s.d.). b, A histogram of pixel resolution.
N is the number of pixels sharing that resolution.




Extended Data Table 1 | Line energies used in the Gaussian fits

Group           Energy (eV)  λ (Å)   Charge state  Transition                     Label   Note                                               Ref.
He-α multiplet  6617.00      1.8737                                                       Blend; identified in (34) as Be- and Li-like iron  35
                6628.93      1.8704  XXIII         1s2s²2p ¹P₁ → 1s²2s² ¹S₀               Be-like
                6636.84      1.8681  XXV           1s2s ³S₁ → 1s² ¹S₀             Heα(z)  Forbidden
                6645.24      1.8658  XXIV          1s2p² ²D₅/₂ → 1s²2p ²P₃/₂              Li-like
                6654.19      1.8633  XXIV          1s2s2p ²P₁/₂ → 1s²2s ²S₁/₂;            Li-like blend
                                                   1s2p² ²D₃/₂ → 1s²2p ²P₁/₂
                6662.09      1.8610  XXIV          1s2s2p ²P₃/₂ → 1s²2s ²S₁/₂             Li-like
                6667.90      1.8594  XXV           1s2p ³P₁ → 1s² ¹S₀             Heα(y)  Intercombination
                6682.45      1.8554  XXV           1s2p ³P₂ → 1s² ¹S₀             Heα(x)  Intercombination
                6700.76      1.8503  XXV           1s2p ¹P₁ → 1s² ¹S₀             Heα(w)  Resonance
H-like doublet  6951.96      1.7834  XXVI          2p ²P₁/₂ → 1s ²S₁/₂            Lyα2                                                       36
                6973.18      1.7780  XXVI          2p ²P₃/₂ → 1s ²S₁/₂            Lyα1                                                       36
He-β doublet    7871.31      1.5751  XXV           1s3p ³P₁ → 1s² ¹S₀             Heβ2                                                       37
                7880.67      1.5733  XXV           1s3p ¹P₁ → 1s² ¹S₀             Heβ1                                                       37

Data are from refs 34-37.
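The energy and wavelength columns of Extended Data Table 1 are related by λ = hc/E. A quick consistency check is sketched below; the rounded constant hc ≈ 12,398.42 eV Å is a standard value supplied here, and the function name is illustrative.

```python
HC_EV_ANGSTROM = 12398.42  # h*c in eV·Å (rounded)

def ev_to_angstrom(energy_ev):
    """Convert a photon energy in eV to a wavelength in Å."""
    return HC_EV_ANGSTROM / energy_ev

# (energy in eV, tabulated wavelength in Å) pairs taken from the table.
table_rows = [(6617.00, 1.8737), (6700.76, 1.8503), (6973.18, 1.7780)]
for energy_ev, tabulated in table_rows:
    print(f"{energy_ev:.2f} eV -> {ev_to_angstrom(energy_ev):.4f} Å (table: {tabulated:.4f} Å)")
```

The computed wavelengths agree with the tabulated ones to the four decimal places given.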


LETTER 


doi:10.1038/nature18314 


Photodissociation of ultracold diatomic strontium 
molecules with quantum state control 


M. McDonald1, B. H. McGuyer1, F. Apfelbeck1†, C.-H. Lee1, I. Majewska2, R. Moszynski2 & T. Zelevinsky1


Chemical reactions at ultracold temperatures are expected to be 
dominated by quantum mechanical effects. Although progress 
towards ultracold chemistry has been made through atomic 
photoassociation!, Feshbach resonances” and bimolecular 
collisions, these approaches have been limited by imperfect 
quantum state selectivity. In particular, attaining complete control 
of the ground or excited continuum quantum states has remained 
a challenge. Here we achieve this control using photodissociation, 
an approach that encodes a wealth of information in the angular 
distribution of outgoing fragments. By photodissociating ultracold 
⁸⁸Sr₂ molecules with full control of the low-energy continuum,
we access the quantum regime of ultracold chemistry, observing 
resonant and nonresonant barrier tunnelling, matter-wave 
interference of reaction products and forbidden reaction pathways. 
Our results illustrate the failure of the traditional quasiclassical 
model of photodissociation*’ and instead are accurately described 
by a quantum mechanical model®”. The experimental ability to 
produce well-defined quantum continuum states at low energies 
will enable high-precision studies of long-range molecular 
potentials for which accurate quantum chemistry models are 
unavailable, and may serve as a source of entangled states and 
coherent matter waves for a wide range of experiments in quantum 
optics!>!?, 

To obtain full control over the initial (molecular) and final (continuum)
quantum states, we photodissociate diatomic strontium molecules
(⁸⁸Sr₂) that are optically trapped at a temperature of ~5 μK (ref. 12).
These molecules are produced by photoassociating laser-cooled Sr
atoms in a far-off-resonant one-dimensional (1D) optical lattice with a
depth of up to 50 μK. The Sr atoms are divalent and do not form covalent
chemical bonds. However, the ground-state Sr₂ dissociation energy
is larger than in typical van der Waals complexes and similar to hydrogen
bonded systems such as the water dimer. The ⁸⁸Sr₂ molecules that
we produce are either weakly bound near the ground state threshold
(¹S + ¹S atomic limit) or, with an extra step of optical preparation,
near the lowest singly excited threshold (¹S + ³P₁). The long-lived
(22 μs) excited atomic state ³P₁ is responsible for the low laser-cooling
temperature, efficient molecule creation, accurate state preparation
and high spectroscopic resolution that allows photodissociation
very close to the threshold. Photodissociation is driven by a 10-20 μs
pulse of linearly polarized 689 nm light (intensity 0.3-30 W cm⁻²,
bandwidth <200 Hz) propagating along the lattice axis. The light
frequency is chosen to probe a continuum energy in the range of
0-15 mK because this matches typical electronic and rotational barrier
heights. After a controlled delay, the fragments are detected by
absorption imaging via the strong ¹S–¹P₁ Sr transition using 461 nm
light propagating almost parallel to the lattice axis, so that the initial
sample of >10* molecules appears as a point source. This produces a 
2D projection of the 3D spherical shell (Newton sphere) formed by 
the expanding fragments. The experimental geometry is illustrated 
in Fig. 1, including the definition of angles θ and φ for a dissociating


molecule and an image of the fragments showing clear dependence 
on both angles. The arrangement of the optical lattice, the photo- 
dissociating and imaging light, a camera and a small bias magnetic 
field B that fixes the quantum axis are also shown. In all subsequent 
images the colour scheme is identical to that of Fig. 1b apart from 
the overall normalization and the fields of view are 0.1-0.9mm on 
each side. 

Following photodissociation, the angular distribution of fragment 
positions is described by an intensity (or differential cross-section) 


$$I(\theta, \phi) = |f(\theta, \phi)|^2 \qquad (1)$$

which is the square of a scattering amplitude f that can be expanded
in terms of partial amplitudes, $f(\theta, \phi) = \sum_{JM} f_{JM}\,\psi_{JM}(\theta, \phi)$. This
expansion uses angular basis functions $\psi_{JM}$ of the outgoing electronic


[Figure 1 image; colour scale: Sr number (a.u.)]
o = 


Figure 1 | Photodissociation of diatomic molecules in an optical lattice.
a, A homonuclear molecule (black circles) producing fragments
(green circles) with well-controlled speeds forms a Newton sphere.
The distribution of the fragments on the sphere surface is parameterized
by a polar angle θ relative to the z axis and an azimuthal angle φ relative
to the x axis in the xy plane. The photodissociating (PD) light propagates
along +x. b, An experimental image of the fragments corresponds to
the Newton sphere projected onto the yz plane. This particular image is
one of many we observed that is highly quantum mechanical in nature
and distinctly lacks fragments that are emitted along the xz plane.
The distribution is thus not cylindrically symmetric about the z axis
and depends on φ in addition to θ. a.u., arbitrary units. c, The fragments
(green ovals) are detected by absorption imaging using a charge-coupled
device (CCD) camera and a wide light beam from an optical fibre. The
photodissociating light is coaligned with the lattice axis along x. The
imaging light is nearly coaligned with x (a small tilt is present for technical
reasons). A magnetic field can be applied along the z axis.


1Department of Physics, Columbia University, 538 West 120th Street, New York, New York 10027-5255, USA. 2Quantum Chemistry Laboratory, Department of Chemistry, University of Warsaw,
Pasteura 1, 02-093 Warsaw, Poland. †Present address: Faculty of Physics, Ludwig Maximilian University of Munich, Schellingstrasse 4, 80799 Munich, Germany.




[Figure 2 image. a, Level scheme: X(−1, 0) below the ¹S + ¹S threshold, with PD to the ¹S + ³P₁ continuum. b, Horizontal axis: internuclear separation (Bohr). c, Horizontal axes: continuum energy ε/h (MHz) and ε/k_B (mK); legend: Experiment (axial), Experiment (side), Theory.]


Figure 2 | Photodissociation to a multichannel continuum. a, Schematic
for PD of ⁸⁸Sr₂ in the initial ground state X(vᵢ, Jᵢ) to an excited continuum
energy ε, which is subsequently expressed in MHz or in mK (via the
Boltzmann constant k_B). b, Potential energy structure (<1 mK) of the
¹S + ³P₁ continuum, showing both of the electronic potentials (0u+ and 1u)
that couple to the ground state via E1 transitions. c, The angular
anisotropy parameter β₂₀ for this process measured by two imaging
methods (using axial-view and side-view CCD cameras) and calculated
using a quantum chemistry model. The inset images show fragments
at three different energies ε/h labelled in MHz. The images and curves
indicate a steep change in the angular anisotropy in the 0-2 mK
continuum energy range. The experimental errors for axial imaging were
estimated by varying the choice of centre point for the pBasex algorithm
and averaging the results, and for side imaging from least-squares fitting to
equation (2) convolved with a blurring function to account for
experimental imperfections.


channel, where J and M are the total angular momentum and its
projection onto the quantum axis, respectively. The intensities for
separate electronic channels superpose to produce the total intensity
I(θ, φ). Cylindrically asymmetric distributions with φ dependence are
possible if several M states are coherently created because
$\psi_{JM}(\theta, \phi) = e^{iM\phi}\,\psi_{JM}(\theta, 0)$. Our measured angular distributions can
be summarized with the parameterization

$$I(\theta, \phi) \propto 1 + \sum_{l=1}^{\infty} \sum_{m=0}^{l} \beta_{lm} \cos(m\phi)\, P_l^m(\cos\theta) \qquad (2)$$

where $P_l^m(\cos\theta)$ is an associated Legendre polynomial and l is
restricted to even values for homonuclear diatomic molecules. The $\beta_{lm}$
coefficients are directly related to the amplitudes $f_{JM}$ but hide some of
the simplicity that is apparent from using the amplitudes with equation
(1). Besides their use in photodissociation, fragment angular distributions
are powerful observables in photoionization experiments13 as they
provide a route to completely measure the ionization matrix element
amplitudes and phases14. The internal angular momenta of the fragments
may also carry valuable information15.
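To make the role of the anisotropy coefficients concrete, the sketch below evaluates equation (2) when only the l = 2, m = 0 term is kept, so that I(θ) ∝ 1 + β₂₀ P₂(cos θ) with P₂(x) = (3x² − 1)/2. The three β₂₀ values used are the textbook limiting cases (2, 0 and −1) chosen for illustration; the function names are not taken from the paper.

```python
import math

def legendre_p2(x):
    """Second Legendre polynomial P_2(x)."""
    return 0.5 * (3.0 * x * x - 1.0)

def intensity(theta, beta20):
    """Unnormalized angular intensity 1 + beta20 * P_2(cos(theta))."""
    return 1.0 + beta20 * legendre_p2(math.cos(theta))

for beta20 in (2.0, 0.0, -1.0):
    along_z = intensity(0.0, beta20)              # emission along the quantum axis
    equatorial = intensity(math.pi / 2.0, beta20)  # emission perpendicular to it
    print(f"beta20 = {beta20:+.0f}: I(0) = {along_z:.2f}, I(pi/2) = {equatorial:.2f}")
```

With β₂₀ = 2 the intensity is proportional to cos²θ and vanishes at the equator, β₂₀ = 0 gives a uniform shell, and β₂₀ = −1 gives a sin²θ pattern that vanishes along the axis.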

To investigate a multichannel electronic continuum at very low dissociation
energies, ε, we prepared ultracold molecules in the Jᵢ = 0
initial state of the least-bound vibrational level vᵢ = −1 (negative vᵢ
count down from threshold) of the ground potential X and photodissociated
them at the excited ¹S + ³P₁ continuum via the electric
dipole (E1) process illustrated in Fig. 2a with an applied field B = 0.
There are four allowed channels in the excited continuum, which are
labelled 0u+, 1u, 0g+ and 1g, where the letters u/g refer to the inversion


[Figure 3 image. b, Horizontal axes: continuum energy ε/h (MHz) and ε/k_B (mK); legend: p = 0, |p| = 1.]


Figure 3 | E1-forbidden photodissociation experiment and theory.
a, Molecules in Mᵢ = 0 of the long-lived 1g state below the ¹S + ³P₁
threshold are prepared with a bound-bound (B-B) π pulse and
fragmented at the gerade ground continuum with PD light. b, M1/E2
photodissociation produces photofragments for ε > 0 (right), and
as predicted is strongest for p = 0. Solid curves are calculations of
the total transition strength using a quantum chemistry model. E1
photodissociation to the ³P₁ + ³P₁ continuum also appears for ε < 0 (left).
The inset image shows fragments for p = 0 and ε/h = 8 MHz. The strong
central dot results from spontaneous E1-forbidden photodissociation of
the molecules into low-energy atoms that are captured by the lattice.


symmetry of the wave function and the numbers 0/1 refer to the internuclear
axis projection of the electronic angular momentum. Only
u-symmetric channels are E1-accessible from the ground state. Here,
the light polarization sets the quantum axis along z and the fragments
can only have J = 1, M = 0 quantum numbers because J ≥ 1 for the
0u+ and 1u electronic potentials shown in Fig. 2b. As 1u has an ~30 MHz
(~1.5 mK) repulsive electronic barrier, we expect the fragment angular
distribution to evolve in the probed energy range due to barrier
tunnelling. We observe a steep variation of the single anisotropy parameter
needed to describe this process, β₂₀ from equation (2). Two
methods were used to measure these data: axial-view imaging processed
with the pBasex algorithm16 and side-view imaging integrated along
the lattice and fitted to a density profile. Figure 2c shows that both
methods agree and reveals an evolution of the fragment distribution
from a parallel dipole (β₂₀ ≈ 2 at ε/h ≈ 5 MHz, where h is Planck’s
constant) to a uniform shell (β₂₀ ≈ 0 at ε/h ≈ 12 MHz) and then a perpendicular
dipole (β₂₀ ≈ −1 at ε/h ≈ 50 MHz). A quantum chemistry
model was used to calculate the expected anisotropy curve in Fig. 2c
by connecting the bound and continuum wave functions via Fermi’s
golden rule to compute the amplitudes f_JM, showing strong qualitative
agreement with the data. The theoretical 0u+ and 1u Coriolis-mixed potentials
agree well with high-precision bound-state ⁸⁸Sr₂ spectroscopy, but
this work is the first test of their predictive power in the continuum.
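The continuum energies above are quoted both as frequencies (ε/h) and as temperatures (ε/k_B). The conversion between the two is T = hν/k_B, sketched below with the exact SI values of the Planck and Boltzmann constants; the function name is illustrative.

```python
PLANCK_J_S = 6.62607015e-34     # Planck constant, J s (exact SI value)
BOLTZMANN_J_K = 1.380649e-23    # Boltzmann constant, J/K (exact SI value)

def mhz_to_millikelvin(frequency_mhz):
    """Convert an energy expressed as a frequency (MHz) to a temperature (mK)."""
    energy_joule = PLANCK_J_S * frequency_mhz * 1.0e6
    return energy_joule / BOLTZMANN_J_K * 1.0e3

for frequency_mhz in (5.0, 12.0, 30.0, 50.0):
    print(f"{frequency_mhz:5.1f} MHz -> {mhz_to_millikelvin(frequency_mhz):.2f} mK")
```

A 30 MHz barrier comes out at about 1.4 mK, consistent with the approximate value of 1.5 mK given in the text.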

E1-forbidden photodissociation is an important effect in atmospheric
physics and must be considered when calculating the total
absorption cross section for molecular oxygen within the so-called
Herzberg continuum. Surprisingly, however, neither magnetic dipole
(M1) nor electric quadrupole (E2) photodissociation has been
directly observed previously. In most cases E1 is also present, making
it challenging to study the weaker M1/E2 processes. However,
experiments with ultracold Sr₂ allow measurements of pure M1/E2
photodissociation and a comparison with quantum mechanical
calculations. Using resonant π pulses we prepare metastable molecules
in a Jᵢ = 1, Mᵢ = 0 state of the least-bound vibrational level of
the subradiant 1g potential that has no E1 coupling to the ground
state17, as sketched in Fig. 3a. The frequency of the dissociating light
was varied as shown in Fig. 3b. Here p = 0 (|p| = 1) implies that the
light polarization has a magnetic field parallel (perpendicular) to the
quantum axis. The prominent, polarization-independent feature on
the left (ε < 0) is E1 photodissociation above the ³P₁ + ³P₁ threshold,
whereas the weaker, polarization-dependent feature on the right is
M1/E2 photodissociation. As the figure shows, the strength of this




[Figure 4 image: grid of panels labelled by initial state (vᵢ, Jᵢ) from (−1, 1) to (−1, 4); some panels are marked 'Experimentally inaccessible']


Figure 4 | Photodissociation of singly excited (¹S + ³P₁) molecules to the
ground-state continuum with energies of several millikelvin. Each row
and column corresponds to molecules prepared in the indicated
1u(vᵢ, Jᵢ) state and Mᵢ sublevel. (Mᵢ = 4 was not accessible experimentally.)
The upper and lower sections correspond to PD light polarizations |p| = 1
and 0, respectively, where the PD laser’s electric field is E_PD. Within each
square panel, the experimental image is on the top right, with
a comparable simulation of a projected Newton sphere on the bottom
right. The full sphere rendition is on the bottom left and the top left shows
the mapping of the fragment detection probability at each angle onto
the radial coordinate of a surface. For |p| = 1, matter-wave interference
occurs if two values of M are produced, leading to strongly φ-dependent
patterns. For each case, the degree of agreement with the quasiclassical
approximation is indicated by a coloured dot, as explained in the text.

forbidden process tapers off rapidly and is substantial only below
~1 mK. The inset displays fragments near the peak of the p = 0
spectrum. Although the number of fragments for p = 0 is unaffected
by interference between M1 and E2 pathways, our calculations indicate
that their angular distributions (Extended Data Fig. 1) are sensitive to
this rarely observed interference.

We take advantage of the single-channel spinless ground state of
⁸⁸Sr₂ to explore chemistry in the ultracold regime, obtain a library of
fragment distributions and test a quasiclassical model of photodissociation.
We prepare singly excited (¹S + ³P₁) molecules with quantum
numbers Jᵢ, Mᵢ and immediately photodissociate them at the ¹S + ¹S
ground state continuum, in some cases applying B up to 20 G to enable
symmetry-forbidden E1 transitions18. To control the final value of
J in the continuum (which quantum statistics requires to be even for
bosonic ground-state ⁸⁸Sr₂) we either obtain a unique J by choosing
to start from an even Jᵢ and taking advantage of selection rules, or,
if multiple ‘partial waves’ with different J are possible and interfere,
we choose an ε value at which a single J wave strongly dominates, as


discussed below. To control the final values of M we orient the linear 
polarization of the photodissociating light either parallel (p =0) to the 
quantum axis, for which selection rules ensure M = M,, or perpen- 
dicular (|p| = 1), for which M=M,+ 1. Thus, we are able to engineer 
and image different continua in either pure M states or as their coher- 
ent quantum interference. Disruption from Zeeman shifts is avoided 
because the ground continuum is practically nonmagnetic. 

Figure 4 shows a full range of distributions parameterized by
equation (1) with either f(θ, φ) = Y_{J,M_i}(θ, φ) or
f(θ, φ) = √R Y_{J,M_i−1}(θ, φ) + e^{iδ} √(1 − R) Y_{J,M_i+1}(θ, φ).
Here the spherical harmonics Y_{JM} = Ψ_{JM} for
the ground continuum, R and δ are the relative amplitude and phase
parameters and J = 2 or 4. (At the chosen continuum energies, the
p = 0 patterns for J_i = 1, 3 would be nearly redundant with J_i = 2, 4 and
so are omitted.) Quantum mechanical calculations, included for
comparison, assume that the continuum states are dominated by the
higher J contribution. Figure 4 suggests the following observations.
First, the coherent superposition of a pair of M, which occurs for |p| = 1
but not p = 0, leads to clean observations of distributions without
cylindrical symmetry, previously unreported for diatomic molecules.
In particular, multiple cases are shown of a molecule fragmenting into
up to eight distinct (θ, φ) regions. Second, the same final states (J = 4,
M = ±1) are produced for |p| = 1, M_i = 0 and J_i = 4, 3 at the chosen
continuum energies. Thus we could expect to observe identical fragment
patterns. However, a subtle point is that odd J_i and even J_i produce
M = M_i ± 1 probability amplitudes with an opposite relative
phase. This results in identical φ-dependent patterns rotated by 90°
relative to each other. The same mapping of the relative phase onto the
rotation angle occurs for |p| = 1, M_i = 0 and J_i = 2, 1. Third, the previous
point roughly holds for the higher values of M_i as well, but non-identical
populations of M = M_i ± 1 are produced due to asymmetrical
coupling strengths. For example, the matter-wave interference patterns
for (J_i, M_i) = (4, 2) and (3, 2) are not only rotated relative to each
other, but have slightly different shapes.
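The link between an opposite relative phase and a 90° rotation can be checked numerically. The sketch below is only an illustration, not the authors' analysis code: it takes the J = 2, M = ±1 superposition as an example, uses the standard closed form of Y_{2,±1}, and the function names (`y2`, `intensity`, `max_rotation_mismatch`) are hypothetical.

```python
import cmath
import math


def y2(m, theta, phi):
    """Spherical harmonic Y_{2,m} for m = -1 or +1 (Condon-Shortley convention)."""
    if m not in (-1, 1):
        raise ValueError("this sketch only implements m = -1 and m = +1")
    norm = math.sqrt(15.0 / (8.0 * math.pi))
    return -m * norm * math.sin(theta) * math.cos(theta) * cmath.exp(1j * m * phi)


def intensity(theta, phi, ratio, delta):
    """|f|^2 for f = sqrt(R) Y_{2,-1} + exp(i*delta) sqrt(1-R) Y_{2,+1}."""
    f = (math.sqrt(ratio) * y2(-1, theta, phi)
         + cmath.exp(1j * delta) * math.sqrt(1.0 - ratio) * y2(1, theta, phi))
    return abs(f) ** 2


def max_rotation_mismatch(ratio=0.5, delta=0.3, steps=12):
    """Largest difference between flipping the phase by pi and rotating phi by 90 degrees."""
    worst = 0.0
    for i in range(1, steps):
        theta = math.pi * i / steps
        for k in range(steps):
            phi = 2.0 * math.pi * k / steps
            flipped = intensity(theta, phi, ratio, delta + math.pi)
            rotated = intensity(theta, phi + math.pi / 2.0, ratio, delta)
            worst = max(worst, abs(flipped - rotated))
    return worst
```

Calling `max_rotation_mismatch()` returns a value at the level of floating-point rounding for any R between 0 and 1, because the φ dependence enters only through cos(2φ + δ).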

Over the past few decades a quasiclassical model has been
advanced to predict the angular distributions for single-photon E1
photodissociation of diatomic molecules prepared in arbitrary quantum
states⁶,⁷,¹⁹. This approach multiplies the conventional
distribution⁴,⁵ for molecules prepared in spherically symmetric
states or ensembles, I(θ) ∝ 1 + β_20 P_2^0(cos χ), by a probability density
|Φ_i|² for the initial molecular axis orientation, which gives
I(θ, φ) ∝ |Φ_i(θ, φ)|² [1 + β_20 P_2^0(cos χ)], where χ = χ(θ, φ) is the polar
angle defined with respect to the orientation of linear polarization of
the photodissociating light and (θ, φ) are defined by the quantization
axis, as before. This intuitive model suggests that photodissociation
probes the 'shape' of the initial molecules, as detailed in Extended
Data Fig. 2. Its validity, however, has been questioned over the years¹⁹,²⁰.
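As a rough illustration of the quasiclassical expression, the sketch below evaluates |Φ_i|² [1 + β_20 P_2(cos χ)] for a given fragment direction. It is a sketch under stated assumptions rather than the authors' code: the quantization axis is taken as z, perpendicular light is placed along x, a spherically symmetric initial state is the default, and all function names are hypothetical.

```python
import math


def legendre_p2(x):
    """Second Legendre polynomial P_2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)


def cos_chi(theta, phi, polarization_axis):
    """Cosine of the angle between the fragment direction and the light polarization.

    Assumption: the quantization axis is z and 'perpendicular' light lies
    along x (choosing x rather than y is arbitrary for this sketch).
    """
    if polarization_axis == "z":
        return math.cos(theta)
    if polarization_axis == "x":
        return math.sin(theta) * math.cos(phi)
    raise ValueError("polarization_axis must be 'z' or 'x'")


def quasiclassical_intensity(theta, phi, beta20, polarization_axis="z",
                             axis_density=None):
    """Unnormalized |Phi_i|^2 * [1 + beta20 * P_2(cos chi)]."""
    if axis_density is None:
        density = 1.0 / (4.0 * math.pi)  # spherically symmetric initial state
    else:
        density = axis_density(theta, phi)  # user-supplied |Phi_i(theta, phi)|^2
    return density * (1.0 + beta20 * legendre_p2(cos_chi(theta, phi, polarization_axis)))
```

With β_20 = 2 and polarization along z the bracket reduces to 3 cos²θ, so no fragments are predicted at θ = 90°; with β_20 = −1 it reduces to (3/2) sin²θ, which vanishes along the axis.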

To indicate the level of agreement with the quasiclassical model, we
include coloured dots for each pattern in Fig. 4. A green dot indicates
exact agreement between the quasiclassical and quantum mechanical
calculations, a yellow dot indicates qualitative agreement that cannot
be made exact by adjusting β_20, an orange dot indicates disagreement
that can become a qualitative agreement by adjusting β_20 and a red dot
indicates clear disagreement for all β_20, usually because fragments
are observed where |Φ_i|² has a node. For all cases in Fig. 4 the
quasiclassical model fails to varying degrees. Although this could be
expected for the 1_u initial states¹⁹, surprisingly even photodissociation
of the 0_u^+ states (Extended Data Fig. 3) disagrees with the quasiclassical
model in all cases where more than a single J is possible in the continuum.
This is because only the single-J cases allow the quasiclassical
assumption of prompt axial recoil to be satisfied at such low continuum
energies. Furthermore, our experiments demonstrate that initial molecules
with different shapes (for example, 0_u^+ versus 1_u) can produce
nearly identical distributions, highlighting that the fragment distributions
are solely determined by the final (continuum) states.

Ultracold photodissociation readily reveals features of the
continuum just above the threshold. The ability to freely explore a
large range of continuum energies, together with strict optical selection
rules and cleanly prepared quantum states, provides a versatile
tool to isolate and study individual reaction channels. Whereas
Fig. 2 explored tunnelling through an electronic barrier, Fig. 5
shows the case when only rotational barriers are present. Here
molecules prepared in the 0_u^+(v_i = −3, J_i = 3, M_i = 0) state are
photodissociated with p = 0, resulting in continuum states with M = 0
and J = 2, 4. This mixture can be described by equation (1) with
f(θ, φ) = √R Y_20(θ, φ) + e^{iδ} √(1 − R) Y_40(θ, φ). Figure 5a is a plot of
the branching ratio R and the interference amplitude 2 cos δ √(R(1 − R))
for the 0-15 mK range of continuum energies. The data show a good
qualitative agreement with quantum chemistry calculations, and reveal
a predicted but so far unobserved g-wave shape resonance (or quasibound
state) confined by the J = 4 centrifugal barrier. This long-lived
(~10 ns) resonance ~66 MHz above threshold could be used to control
light-assisted molecule formation rates²¹. Shape resonances can also
be mapped with magnetic Feshbach dissociation of ground-state
molecules²²–²⁴. However, photodissociation is more widely applicable to
molecules with any type of spin structure in any electronic state, and
allows more control over the quantum numbers. In Fig. 5b an anisotropic,
energy-independent pattern is visible on all images with a
radius close to that of the 62 MHz image. We have confirmed that this
signal arises from spontaneous photodissociation of the molecules
into the g-wave shape resonance (Extended Data Fig. 4).

Figure 5 | Energy-dependent photodissociation near a shape resonance.
a, Molecules prepared in the 0_u^+(v_i = −3, J_i = 3, M_i = 0) state are
photodissociated at the ground continuum. For p = 0, selection rules
lead to a single M = 0 but a mixture of J = 2, 4. The branching ratio and
interference amplitude of this mixture, as described in the text, evolve
with energy and reveal a J = 4 (g-wave) shape resonance at ~3 mK. The
experimental data were analysed with pBasex and errors were estimated
by varying the effective saturation intensity, used to process the absorption
images, within its uncertainty. The theoretical curves were calculated
with a quantum chemistry model. b, Images of fragments labelled by
their continuum energies ε/h in MHz that show the evolution with energy.
The faint anisotropic, energy-independent pattern with roughly the same
radius as the 62 MHz image is from spontaneous decay into the shape
resonance.
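To make the roles of the branching ratio R and the interference amplitude 2 cos δ √(R(1 − R)) concrete, the sketch below expands |f|² for a mixture of Y_20 and Y_40 and integrates it over the sphere. This is an illustrative sketch only, using the standard closed forms of Y_20 and Y_40; the function names are hypothetical and it is not the pBasex analysis used for Fig. 5.

```python
import cmath
import math


def y20(theta):
    """Spherical harmonic Y_{2,0} (real, independent of phi)."""
    c = math.cos(theta)
    return 0.25 * math.sqrt(5.0 / math.pi) * (3.0 * c * c - 1.0)


def y40(theta):
    """Spherical harmonic Y_{4,0} (real, independent of phi)."""
    c = math.cos(theta)
    return (3.0 / 16.0) * math.sqrt(1.0 / math.pi) * (35.0 * c ** 4 - 30.0 * c * c + 3.0)


def interference_amplitude(ratio, delta):
    """Coefficient of the cross term Y_20 * Y_40 in |f|^2."""
    return 2.0 * math.cos(delta) * math.sqrt(ratio * (1.0 - ratio))


def intensity(theta, ratio, delta):
    """|f|^2 for f = sqrt(R) Y_20 + exp(i*delta) sqrt(1-R) Y_40."""
    f = (math.sqrt(ratio) * y20(theta)
         + cmath.exp(1j * delta) * math.sqrt(1.0 - ratio) * y40(theta))
    return abs(f) ** 2


def intensity_expanded(theta, ratio, delta):
    """Same quantity written as two populations plus one interference term."""
    return (ratio * y20(theta) ** 2
            + (1.0 - ratio) * y40(theta) ** 2
            + interference_amplitude(ratio, delta) * y20(theta) * y40(theta))


def total_yield(ratio, delta, steps=2000):
    """Midpoint-rule integral of the intensity over the full sphere."""
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += intensity(theta, ratio, delta) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total
```

Because Y_20 and Y_40 are orthogonal, the cross term integrates to zero: the interference amplitude reshapes the angular pattern but leaves the total fragment yield unchanged.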

This work explores light-induced molecular fragmentation in the
fully quantum regime. Quasiclassical descriptions are not applicable
and our observations are dominated by coherent superpositions of
matter waves originating from monoenergetic continuum states with
different quantum numbers. The results agree with a state-of-the-art
quantum chemistry model⁸,⁹, but challenge the theory to describe
more complicated phenomena. For example, preliminary observations
of photodissociation to the doubly excited continuum (as in Fig. 3b)
indicate rich structure near the threshold. This continuum is not well
understood, while interactions near the ³P₀ + ³P₀ threshold play a key
role in recent proposals and experiments in ultracold many-body
science²⁵. Other excited continua with even longer lifetimes (for example,
the subradiant 1_g and 0_g^+ manifolds) exist for Sr₂ and similar molecules
and should enable the exploration of entangled continuum states.
Photodissociation can shed light on the ultracold chemistry of a rich
array of molecular states, as well as on new reaction mechanisms, as
was shown here with M1/E2 photodissociation. With improved control
of the imaging and of the optical lattice effects, experiments can
get even closer to the threshold. We expect to reach nanokelvin fragment
energies in the lattice, leading to high-precision measurements
of binding energies for tests of fundamental physics and molecular
quantum electrodynamics²⁶,²⁷. Ultralow fragment energies can also
aid in the creation of novel ultracold atomic gases²⁸. A promising
future direction would be to enhance the quantum control achieved
here by manipulating the final continuum states with external
fields²⁹,³⁰. We have shown the extreme sensitivity of weakly bound
molecules to small magnetic fields¹⁸, and the same principle applies
just above threshold. This external control over ultracold chemistry
should allow the study and manipulation of new reaction pathways.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 4 January; accepted 26 April 2016. 


1. Jones, K. M., Tiesinga, E., Lett, P. D. & Julienne, P. S. Ultracold photoassociation 
spectroscopy: long-range molecules and atomic scattering. Rev. Mod. Phys. 78, 
483-535 (2006). 

2. Chin, C., Grimm, R., Julienne, P. & Tiesinga, E. Feshbach resonances in ultracold 
gases. Rev. Mod. Phys. 82, 1225-1286 (2010). 

3. Ospelkaus, S. et al. Quantum-state controlled chemical reactions of ultracold
potassium-rubidium molecules. Science 327, 853-857 (2010). 

4. Zare, R. N. & Herschbach, D. R. Doppler line shape of atomic fluorescence 
excited by molecular photodissociation. Proc. IEEE 51, 173-182 (1963). 

5. Zare, R. N. Photoejection dynamics. Mol. Photochem. 4, 1-37 (1972). 

6. Choi, S. E. & Bernstein, R. B. Theory of oriented symmetric-top molecule 
beams: precession, degree of orientation, and photofragmentation of 
rotationally state-selected molecules. J. Chem. Phys. 85, 150-161 (1986). 

7. Zare, R. N. Photofragment angular distributions from oriented symmetric-top 
precursor molecules. Chem. Phys. Lett. 156, 1-6 (1989). 

8. Skomorowski, W., Pawłowski, F., Koch, C. P. & Moszynski, R. Rovibrational
dynamics of the strontium molecule in the A ¹Σ_u^+, c ³Π_u, and a ³Σ_u^+ manifold
from state-of-the-art ab initio calculations. J. Chem. Phys. 136, 194306
(2012).

9. Borkowski, M. et al. Mass scaling and nonadiabatic effects in photoassociation 
spectroscopy of ultracold strontium atoms. Phys. Rev. A 90, 032713 (2014). 

10. Grangier, P., Aspect, A. & Vigue, J. Quantum interference effect for two atoms 
radiating a single photon. Phys. Rev. Lett. 54, 418-421 (1985). 

11. Kheruntsyan, K. V., Olsen, M. K. & Drummond, P. D. Einstein–Podolsky–Rosen
correlations via dissociation of a molecular Bose–Einstein condensate.
Phys. Rev. Lett. 95, 150405 (2005).

12. McGuyer, B. H. et al. Precise study of asymptotic physics with subradiant 
ultracold molecules. Nature Phys. 11, 32-36 (2015). 

13. Reid, K. L. Photoelectron angular distributions. Annu. Rev. Phys. Chem. 54, 
397-424 (2003). 

14. Hockett, P., Wollenhaupt, M., Lux, C. & Baumert, T. Complete photoionization 
experiments via ultrafast coherent control with polarization multiplexing. 
Phys. Rev. Lett. 112, 223001 (2014). 

15. Rakitzis, T. P., Kandel, S. A., Alexander, A. J., Kim, Z. H. & Zare, R. N.
Photofragment helicity caused by matter-wave interference from multiple
dissociative states. Science 281, 1346-1349 (1998).

16. Garcia, G. A., Nahon, L. & Powis, I. Two-dimensional charged particle image
inversion using a polar basis function expansion. Rev. Sci. Instrum. 75, 
4989-4996 (2004). 




17. McGuyer, B. H. et al. High-precision spectroscopy of ultracold molecules in an
optical lattice. New J. Phys. 17, 055004 (2015). 

18. McGuyer, B. H. et al. Control of optical transitions with magnetic fields in 
weakly bound molecules. Phys. Rev. Lett. 115, 053001 (2015). 

19. Beswick, J. A. & Zare, R. N. On the quantum and quasiclassical angular 
distributions of photofragments. J. Chem. Phys. 129, 164315 (2008). 

20. Seideman, T. The analysis of magnetic-state-selected angular distributions: 
a quantum mechanical form and an asymptotic approximation. Chem. 
Phys. Lett. 253, 279-285 (1996). 

21. González-Férez, R. & Koch, C. P. Enhancing photoassociation rates by
nonresonant-light control of shape resonances. Phys. Rev. A 86, 063420 (2012). 

22. Volz, T. et al. Feshbach spectroscopy of a shape resonance. Phys. Rev. A 72, 
010704(R) (2005). 

23. Mark, M. et al. Stückelberg interferometry with ultracold molecules. Phys. Rev.
Lett. 99, 113201 (2007). 

24. Knoop, S. et al. Metastable Feshbach molecules in high rotational states. 
Phys. Rev. Lett. 100, 083002 (2008). 

25. Zhang, X. et al. Spectroscopic observation of SU(N)-symmetric interactions in 
Sr orbital magnetism. Science 345, 1467-1473 (2014). 

26. Bartenstein, M. et al. Precise determination of ⁶Li cold collision parameters by
radio-frequency spectroscopy on weakly bound molecules. Phys. Rev. Lett. 
94, 103201 (2005). 

27. Salumbides, E. J. et al. Bounds on fifth forces from precision measurements 
on molecules. Phys. Rev. D 87, 112008 (2013). 

28. Lane, I. C. Production of ultracold hydrogen and deuterium via Doppler-cooled
Feshbach molecules. Phys. Rev. A 92, 022511 (2015). 

29. Lemeshko, M., Krems, R. V., Doyle, J. M. & Kais, S. Manipulation of molecules 
with electromagnetic fields. Mol. Phys. 111, 1648-1682 (2013). 

30. Stapelfeldt, H. & Seideman, T. Colloquium: Aligning molecules with strong 
laser pulses. Rev. Mod. Phys. 75, 543-557 (2003). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We gratefully acknowledge ONR grant N000-14-14-1-0802,
NIST award 60NANB13D163, and NSF grant PHY-1349725 for partial support
of this work, and thank A. T. Grier, G. Z. Iwata and M. G. Tarallo for discussions.
R.M. acknowledges the Foundation for Polish Science for support through the 
MISTRZ programme. 


Author Contributions M.M., B.H.M., F.A., C.-H.L. and T.Z. designed the
experiments, carried out the measurements and interpreted the data.
I.M. and R.M. carried out the calculations and interpreted the data. All authors
contributed to the manuscript.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to T.Z. 
(tz@phys.columbia.edu). 


Reviewer Information Nature thanks D. Chandler and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 




METHODS 


Experimental details. After laser cooling a gas of atomic Sr in a 1D optical lattice,
molecules were created via photoassociation to the 0_u^+(v = −4, J = 1) excited state
(binding energy 1,084 MHz) followed by well-directed spontaneous emission to
the X ¹Σ_g^+(v = −1) ground states with J = 0 or 2 (binding energies of 137 and
67 MHz, respectively)¹⁷,³¹. Any remaining atoms were removed with a pulse of
imaging light. The molecular sample trapped in the lattice is about 20 μm in
diameter and 200 μm long. To prepare metastable 1_g(v_i = −1, J_i = 1) excited states
(binding energy 19 MHz, lifetime ~5 ms), we used a lattice wavelength of ~910 nm
to enable resonant 689 nm π pulses to transfer the population from X(v = −1, J = 0)
to this state before photodissociation¹². For our experimental conditions, this
transfer was ~40% efficient. To prepare shorter-lived 0_u^+ or 1_u excited states, we
used a 689 nm light pulse to drive a resonant bound-bound transition from either
the J = 0 or 2 ground state to the desired state during photodissociation. In both
cases, we used the polarization of this light and excited-state Zeeman shifts¹²,¹⁸,³²
to select M_i. For reference, the binding energies for the 1_u(v_i = −1, J_i) excited states
are 353 MHz for J_i = 1; 287 MHz for J_i = 2; 171 MHz for J_i = 3 and 56 MHz for J_i = 4;
for the 0_u^+(v_i = −3, J_i = 3) state, the binding energy is 132 MHz and for 0_u^+(−4, 1)
it is 1,084 MHz. The ¹S + ¹S and ¹S + ³P₁ thresholds may be spectroscopically
located with kilohertz precision using the lineshape model of ref. 17.

The photodissociating light propagates along the tight-confinement x axis of 
the optical lattice (Gaussian waist ~40 μm), and is linearly polarized along either
the y axis or the z axis. Except for Fig. 2, for which the net magnetic field is nearly 
zero, a field of a few to a few tens of gauss is applied along the z axis to fix a quanti- 
zation axis for excited bound states. The ground bound and continuum states are 
insensitive to this field, so to avoid mixed-quantization effects from tensor light
shifts¹⁸ the optical lattice was linearly polarized along the z axis. We confirmed that
our results are unaffected by the small lattice trap depth (typically 0.6-0.8 MHz). 
A full description of the laboratory-frame spherical tensor components of the 
fields driving the photodissociation transitions is available in the Supplementary 
Information. 

After the photodissociating light pulse, the fragments were allowed to expand 
kinetically for several hundred microseconds before their positions were recorded 
with standard absorption imaging³³. This expansion time is needed to mitigate
blurring due to the finite pulse width and limited imaging resolution, but has the 
cost of diluting the signal over a larger area, which makes imaging artefacts more 
problematic. Therefore, we adjusted this expansion time as needed to optimize the 
signal-to-noise ratio and angular resolution. 

Most absorption images were taken with imaging light aligned nearly along the 
x axis, projecting the fragment positions into the yz plane. Several hundred absorp- 
tion images were averaged to produce a final record of the fragment positions. To 
remove imaging artefacts and incidental absorption from unwanted atoms, the 
experimental sequence was alternated so that every other image contained none 
of the desired fragments, but everything else. The final image was then computed 
as the averaged difference between these interlaced ‘with fragment’ and ‘without 
fragment’ images. For Fig. 2 insets and side-view data, we also used an optical 
pulse to deplete the ground-state population with J= 2 before photodissociating 
the J_i = 0 states.
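The interlaced background subtraction described above can be sketched in a few lines. This is a minimal illustration, assuming the frames arrive as plain 2D lists ordered with, without, with, without; the function name `averaged_difference` is hypothetical and the real pipeline is not specified in the text.

```python
def averaged_difference(frames):
    """Average of (with - without) over interlaced frames.

    `frames` is a list of equally sized 2D lists of pixel values, ordered
    with, without, with, without, ... (an assumed ordering for this sketch).
    """
    if len(frames) < 2 or len(frames) % 2 != 0:
        raise ValueError("need an even, non-zero number of frames")
    rows, cols = len(frames[0]), len(frames[0][0])
    pairs = len(frames) // 2
    result = [[0.0] * cols for _ in range(rows)]
    for p in range(pairs):
        with_frag, without_frag = frames[2 * p], frames[2 * p + 1]
        for r in range(rows):
            for c in range(cols):
                # Anything common to both frames (artefacts, stray atoms) cancels here.
                result[r][c] += (with_frag[r][c] - without_frag[r][c]) / pairs
    return result
```

Subtracting pair by pair, rather than subtracting two separately averaged stacks, keeps slow drifts in the imaging beam from leaking into the final image as long as they are slow compared with one with/without cycle.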

Forbidden photodissociation angular distributions. A comparison of experi- 
mental images of fragment distributions and calculations for the M1/E2 photo- 
dissociation of Fig. 3 is presented in Extended Data Fig. 1. Note that a large light 
intensity was required to drive the forbidden photodissociation process sufficiently 
rapidly to observe these angular distributions. Besides power broadening the line 
shapes in Fig. 3b, this high intensity may have affected the measured fragment 
distributions in Extended Data Fig. 1. 
Quasiclassical model. In the photodissociation literature there is a well-known 
quasiclassical model describing the angular distribution of fragments produced 
by the photodissociation of aligned molecules 

$$ I_{\mathrm{qc}}(\theta, \phi) = |\Phi_i(\theta, \phi)|^2 \left[1 + \beta_{20} P_2^0(\cos\chi)\right] \qquad (3) $$

where the angles are defined in the main text. For homonuclear diatomic
molecules in the Born–Oppenheimer approximation, the probability density for
the internuclear axis orientation of an initial state with quantum numbers J_i, M_i
and |Ω| is given by Wigner D-functions as
(Dhun 


(0, 6, 0) |? + [Ds 


12(0,6 =o)? =D -0(0.8.9)P) (4) 


where Ω is the internuclear projection of the electronic angular momentum.
We observe disagreement with the quasiclassical model in the majority of cases.
At first glance this is surprising because, theoretically, the quasiclassical model
has been shown to be either equivalent or a good approximation to the quantum
mechanical result for most cases of one-photon E1 photodissociation of a diatomic
molecule with prompt axial recoil¹⁹. However, our measurements are performed
at very low continuum energies to reach the ultracold chemistry regime, and thus 
may violate the assumption of axial recoil°. Additionally, ref. 19 predicted that the 
quasiclassical model should fail for the special case of 'perpendicular' transitions
(|ΔΩ| = 1) with initial states that are a superposition of Ω_i states differing by ±2.
This special case includes our measurements of 1_u initial states in Fig. 4, and our
observations support this prediction. 

Extended Data Fig. 2 compares the quasiclassical model with both quantum 
mechanical predictions and experimental images for several cases. For each, 
the construction of the quasiclassical prediction is outlined. As in Fig. 4, we use 
coloured dots to indicate the level of agreement between the two predictions. To 
determine this agreement, the quasiclassical model neglected the nonadiabatic
Coriolis mixing of |Ω_i| (ref. 32). There is also some ambiguity in choosing a value
of β_20 to use with the quasiclassical model. Conventionally, β_20 should be equal to
2 for parallel transitions with ΔΩ = 0 and to −1 for perpendicular transitions with
|ΔΩ| = 1. In cases of persistent disagreement, we varied β_20 as a free parameter
within the physically allowed range of [−1, 2]. Such a variation has been considered
previously as an effect of the breakdown of the axial-recoil approximation³⁴.

We do observe three cases of exact agreement (indicated by green dots in 
Extended Data Fig. 3), two of which are highlighted in Extended Data Fig. 2. The 
reason the quasiclassical model gives exact results is that selection rules only allow 
a single J in these cases, making the axial-recoil approximation no longer necessary. 
Specifically, these cases correspond to 0_u^+ initial states with odd J_i for either |M_i| = J_i
with p = 0 or J_i = 1 and M_i = 0 with |p| = 1, for which the angular distribution is
energy independent. Agreement occurred here without needing to adjust β_20.

In Fig. 2, p = 0 and the initial state J_i = M_i = 0 is spherically symmetric, so the
angular distribution is parameterized only by β_20. Thus, the quasiclassical model
can always be adjusted to agree at any continuum energy. 

Photodissociation of 0_u^+ states. Single-photon E1 photodissociation of 0_u^+ excited
states to the ground continuum is shown in Extended Data Fig. 3, in analogy with
Fig. 4 for 1_u states. In Fig. 4 and Extended Data Figs 2-4, the sign of M_i does not
affect the results, and our experiments used M_i > 0 for some of the data sets and
M_i < 0 for others. To avoid confusion we did not label the figures with |M_i|, which
suggests a superposition of M_i, but instead chose M_i to be positive in the figures.
Spontaneous photodissociation. Extended Data Fig. 4a contains images of the
fragments following spontaneous decay of the excited state 0_u^+(v_i = −3, J_i = 3, M_i)
to the ground continuum. As we selectively populate individual M_i sublevels, the
measured distributions are anisotropic. They are well described by the incoherent
superposition

$$ I(\theta) \propto \sum_{M} |Y_{4M}(\theta, \phi = 0)|^2 \begin{pmatrix} 4 & 1 & 3 \\ -M & M - M_i & M_i \end{pmatrix}^2 \qquad (5) $$

Here, J is restricted to 4 because the strongest decay is to the J = 4 shape resonance
in the ground continuum. If all M_i were equally populated, which would add a sum
over M_i to equation (5), then the distribution would be isotropic.

The shape resonance aids the measurement of the angular distributions because 
it favours a narrow range of continuum energies. Extended Data Figure 4b contains 
the results of pBasex analysis of the inset image and highlights how the radial 
distribution of the atomic fragments is clustered around 66 MHz, revealing an 
~10ns shape resonance lifetime. Extended Data Figure 4c shows that the angular 
distribution from this analysis matches expectations from equation (5). 
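The frequency, temperature and lifetime scales quoted for the shape resonance can be related with two one-line conversions. The sketch below is an illustration under stated assumptions: ε = hν = k_B T is used to express a continuum energy as a temperature, and a Lorentzian line with full width 1/(2πτ) is assumed when turning a lifetime into an energy spread. The function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34   # J s (exact SI value)
BOLTZMANN_K = 1.380649e-23  # J/K (exact SI value)


def frequency_to_temperature_mk(freq_mhz):
    """Continuum energy given as a frequency in MHz, expressed in millikelvin."""
    return PLANCK_H * freq_mhz * 1e6 / BOLTZMANN_K * 1e3


def lifetime_to_width_mhz(lifetime_ns):
    """Full width (MHz) of a Lorentzian line with the given lifetime in ns (assumed lineshape)."""
    return 1.0 / (2.0 * math.pi * lifetime_ns * 1e-9) / 1e6
```

Under these assumptions, 66 MHz corresponds to roughly 3.2 mK, in line with the ~3 mK quoted for the resonance, and a 10 ns lifetime corresponds to an energy spread of roughly 16 MHz.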
Absorption images in figures. Supplementary Tables 1-3 list the parameters used
to generate the theoretical images shown in Fig. 4 and Extended Data Figs 1-3.
To display theoretical results as simulated absorption images, the intensities are
projected into the yz plane by integrating over the x direction. To approximate the
blurring present in experimental images from limited optical resolution and light
pulse durations, the image is convolved with a Gaussian distribution

$$ I(y, z) \propto \int \exp\left[-\frac{\left(\sqrt{x^2 + y^2 + z^2} - R_0\right)^2}{2\sigma^2}\right] I(\theta, \phi)\, dx \qquad (6) $$

where R₀ is the mean radius, σ is the standard deviation, θ = cos⁻¹(z/√(x² + y² + z²))
and φ = sin⁻¹(y/√(x² + y²)). The fractional blur was σ/R₀ = 0.05 except for
Extended Data Fig. 4, where σ/R₀ = 0.2.

The same colouring scheme (Matlab colourmap jet) is used in all experimental
and theoretical absorption images, up to differences between CMYK and RGB
colour mode presentation. Each image was linearly rescaled to fit the finite range
[0, 1] of this scheme. To ensure that the same colour corresponds to zero absorption
in all images, despite the presence of noise and imaging artefacts, the experimental
images are scaled to have an average value of 0.25 in zero-absorption regions
and a maximum value of 1. Likewise, the theoretical images are scaled to have
a minimum value of 0.25 but a maximum value of 0.85 instead of 1, to be more
visually similar to experimental images.
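The colour scaling described here amounts to a linear map that pins a chosen background level to 0.25 and the image maximum to a chosen ceiling. A minimal sketch follows, assuming images are plain 2D lists; `rescale` is a hypothetical helper, not the authors' code. For an experimental image the background would be the average over a zero-absorption region, for a theoretical image the minimum pixel value.

```python
def rescale(image, background, low=0.25, high=1.0):
    """Linearly map `background` to `low` and the image maximum to `high`."""
    peak = max(max(row) for row in image)
    if peak == background:
        raise ValueError("image has no signal above the background level")
    gain = (high - low) / (peak - background)
    return [[low + gain * (value - background) for value in row] for row in image]
```

For example, `rescale(theory_image, min_value, 0.25, 0.85)` would reproduce the scaling quoted for theoretical images, where `theory_image` and `min_value` are placeholders. Pixels below the background level map below 0.25, so noise in experimental images can still fall under the zero-absorption colour; this sketch does not clip them.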

The field of view differs between experimental images because of cropping for
presentation, and falls in the range of 0.1-0.9 mm on each side. For a given image,
the field of view may be accurately determined by calculating the maximum diameter
D of the photodissociation products as D = Cτ√((ε − U)/h). Here, the kinetic
expansion time τ was 0.3 ms for Fig. 1, 0.8 ms for Fig. 2, 0.6 ms for Fig. 3,
0.3-0.4 ms for Fig. 4, 0.39 ms for Fig. 5, 0.6 ms for Extended Data Fig. 1, 0.3-0.4 ms
for Extended Data Fig. 3 and 0.1 ms for Extended Data Fig. 4. The dissociation
energies ε not labelled in insets are listed in Supplementary Tables 1-3. From
conservation of energy, the parameter C = 2√(h/m_Sr) ≈ 1.348 × 10⁻⁴ m s^(−1/2) and
the lattice depth U must be included as a small offset¹⁷,³⁵. For Fig. 1b, for example,
this gives D = 0.34 mm. For theoretical images, D was set to 80% of the image
width.
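The diameter formula follows from sharing the released energy ε − U equally between two atoms of mass m_Sr, so each atom moves at speed √((ε − U)/m_Sr) and the pair separates to 2τ√((ε − U)/m_Sr). A short sketch of the arithmetic, assuming an approximate ⁸⁸Sr atomic mass of 87.9056 u and using hypothetical names and input values, is:

```python
import math

PLANCK_H = 6.62607015e-34              # J s
ATOMIC_MASS_UNIT = 1.66053906660e-27   # kg
MASS_SR88 = 87.9056 * ATOMIC_MASS_UNIT  # kg, approximate 88Sr atomic mass (assumption)

# C = 2 * sqrt(h / m_Sr), in units of m s^(-1/2)
C_FACTOR = 2.0 * math.sqrt(PLANCK_H / MASS_SR88)


def max_diameter_mm(energy_mhz, lattice_depth_mhz, expansion_ms):
    """Maximum fragment diameter D = C * tau * sqrt((eps - U) / h), in mm."""
    net_hz = (energy_mhz - lattice_depth_mhz) * 1e6
    if net_hz < 0:
        raise ValueError("continuum energy must exceed the lattice depth")
    return C_FACTOR * (expansion_ms * 1e-3) * math.sqrt(net_hz) * 1e3
```

`C_FACTOR` evaluates to about 1.347 × 10⁻⁴ m s^(−1/2), consistent with the quoted value. As an illustration with made-up inputs, `max_diameter_mm(70.7, 0.7, 0.3)` gives about 0.34 mm; these energies are placeholders and not the actual Fig. 1b parameters.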

Extracting angular distribution parameters. For angular distributions that
are cylindrically symmetric (depend only on θ), the polar basis set expansion
(pBasex) algorithm¹⁶ can extract the 3D distribution from 2D projections such as
absorption images by fitting the data with the Abel transform of a weighted sum
of the Legendre polynomials. We used the software implementation of the pBasex
algorithm in ref. 36 to analyse the images in Figs 2 and 5 and Extended Data
Fig. 4. For low signal-to-noise images, we found that the extracted distribution
is artificially skewed towards spherical symmetry³⁵. To eliminate this systematic
error, we performed pBasex inversion on a background image made from the set
of without-fragment images that is processed to remove imaging artefacts and
rescaled so that the average value equals that of the background regions in the
final image. The final distribution is then the difference between those extracted
for the original image and for the background image. The parameters β_20 of Fig. 2
and R and δ of Fig. 5 were determined from least-squares fitting of the number of
fragments versus θ in the final distribution.
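The final least-squares step can be illustrated for the simplest case, a fit of β_20 alone. The sketch below is not the authors' fitting routine: it assumes the extracted distribution is given as counts per unit solid angle at a list of polar angles, models it as a [1 + β_20 P_2(cos θ)], and solves the resulting 2×2 normal equations directly. The function names are hypothetical.

```python
import math


def legendre_p2(x):
    """Second Legendre polynomial P_2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)


def fit_beta20(thetas, counts):
    """Least-squares fit of counts ~ a * [1 + beta20 * P_2(cos theta)].

    The model is linear in (a, b = a * beta20), so the 2x2 normal equations
    are solved in closed form. Returns (a, beta20).
    """
    p2 = [legendre_p2(math.cos(t)) for t in thetas]
    n = float(len(thetas))
    s_p = sum(p2)
    s_pp = sum(p * p for p in p2)
    s_y = sum(counts)
    s_py = sum(p * y for p, y in zip(p2, counts))
    det = n * s_pp - s_p * s_p
    if det == 0.0:
        raise ValueError("angles do not constrain the fit")
    a = (s_y * s_pp - s_p * s_py) / det
    b = (n * s_py - s_p * s_y) / det
    return a, b / a
```

On noiseless synthetic data generated with a = 5 and β_20 = 0.7 the function recovers both values to rounding error. Fitting R and δ for a two-harmonic mixture would need a nonlinear fit and is outside this sketch.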

In some cases, such as with the ε/h = 32 MHz inset of Fig. 5b, experimental
issues may lead to images with deviations from the expected cylindrical
symmetry. This may occur, for example, from imperfect control of the
photodissociating light polarization, which may introduce a 'skewness' in the
distribution. Apparent deviations from perfect cylindrical symmetry may also
have occurred because of absorption imaging error induced by imperfectly
correcting for the atomic saturation, which is especially important when the
imaging beam exhibits substantial variations across its spatial profile (as was
the case for our experiment). In such cases, we proceeded with pBasex analysis
but included an estimate of the resulting bias when determining error bars.

For Fig. 2, further analysis was performed by integrating 2D projections along 
y to convert the images to 1D curves along z. This allows parameters such as β_20 to
be directly extracted by fitting the 1D curve with the expected angular distribution, 
similar to Extended Data Fig. 4c. Although this analysis can be performed with 
the axial-view images, for Fig. 2 we did this through separate experiments with 
images taken along the y axis, which had the benefits of a reduced optical depth and 
a smoother intensity profile of the imaging beam. These side-view images are 2D 
projections of the photofragment position onto the xz plane, and are complicated 
by the distribution of occupied sites in the optical lattice. 
Calculation and parameterization of angular distributions. Supplementary 
Information details the calculation and parameterization of photodissociation 
angular distributions used in this work. Supplementary Tables 1-3 list the param- 
eters for all of the theoretical images as well as experimental continuum energies. 


31. Reinaudi, G., Osborn, C. B., McDonald, M., Kotochigova, S. & Zelevinsky, T.
Optical production of stable ultracold ⁸⁸Sr₂ molecules. Phys. Rev. Lett. 109,
115303 (2012).

32. McGuyer, B. H. et al. Nonadiabatic effects in ultracold molecules via anomalous 
linear and quadratic Zeeman shifts. Phys. Rev. Lett. 111, 243003 (2013). 

33. Reinaudi, G., Lahaye, T., Wang, Z. & Guéry-Odelin, D. Strong saturation 
absorption imaging of dense clouds of ultracold atoms. Opt. Lett. 32, 
3143-3145 (2007). 

34. Wrede, E., Wouters, E. R., Beckert, M., Dixon, R. N. & Ashfold, M. N. R.
Quasiclassical and quantum mechanical modeling of the breakdown of the
axial recoil approximation observed in the near threshold photolysis of IBr
and Br₂. J. Chem. Phys. 116, 6064-6071 (2002).

35. Apfelbeck, F. Photodissociation Dynamics of Ultracold Strontium Dimers. MSc 
thesis, Ludwig Maximilian University of Munich (2015). 

36. O'Keeffe, P. et al. A photoelectron velocity map imaging spectrometer for
experiments combining synchrotron and laser radiations. Rev. Sci. Instrum. 
82, 033109 (2011). 


© 2016 Macmillan Publishers Limited. All rights reserved 




Extended Data Figure 1 | Angular distributions for the M1/E2
photodissociation of the 1g(vi = −1, Ji = 1, Mi = 0) state with p = 0 to the
ground continuum. Images are arranged as in Fig. 4. The experimental
images are labelled by the continuum energy ε/h in MHz. To improve
contrast, the strong centre dot from spontaneous decay, as seen in Fig. 3b,
was removed before processing and is covered by a box. The theoretical
images are calculated using a quantum chemistry model.




[Extended Data Figure 2: image panels comparing quasiclassical predictions with quantum mechanical theory and experiment.]

Extended Data Figure 2 | Comparison of quasiclassical and quantum
mechanical (QM) theory with experimental (Exp) images for selected
cases from Fig. 4 and Extended Data Fig. 3. The quasiclassical predictions
follow from equations (3) and (4) assuming β20 = 2 for ΔΩ = 0 and −1
for |ΔΩ| = 1. (The quantum mechanical predictions slightly differ from
those displayed in Fig. 4 and Extended Data Fig. 3 because they are the full
quantum mechanical calculations given in Supplementary Tables 2 and 3.)
As before, coloured dots indicate the level of quasiclassical agreement.




Extended Data Figure 3 | Photodissociation of molecules near the
¹S + ³P₁ threshold to the ground-state continuum. In contrast to Fig. 4,
here the initial states are 0u+ with (vi, Ji) = (−4, 1) or (−3, 3) as indicated.
These initial states lead to nearly identical distributions as those with
the 1u initial states, contrary to the quasiclassical picture. As before,
compatibility with the quasiclassical approximation is indicated
by the coloured dots.




[Extended Data Figure 4: panels a–c; legend: experiment, simulation; axes: signal (arb. units) versus kinetic energy (MHz) and versus θ (radians).]

Extended Data Figure 4 | Spontaneous photodissociation of molecules
prepared in 0u+(vi = −3, Ji = 3, Mi) states. a, Absorption images of angular
distributions versus Mi. Theoretical simulations using equation (5) are
shown underneath. A short expansion time was used to increase visibility.
b, For quantitative analysis, another image (inset) of the Mi = 0 case was
taken with a longer expansion time and analysed with the pBasex
algorithm. The extracted fragment radial distribution shows a focusing
around a certain kinetic energy, which was determined by fitting with a
Gaussian (red curve). Correcting for an offset due to the lattice depth,
this energy corresponds to a shape resonance with a binding energy of
−66 ± 3 MHz. c, The extracted fragment angular distribution qualitatively
matches the calculation (red curve) of equation (5).




LETTER 


doi:10.1038/nature17974 


Single-molecule strong coupling at room 
temperature in plasmonic nanocavities 


Rohit Chikkaraddy1, Bart de Nijs1, Felix Benz1, Steven J. Barrow2, Oren A. Scherman2, Edina Rosta3, Angela Demetriadou4,
Peter Fox4, Ortwin Hess4 & Jeremy J. Baumberg1


Photon emitters placed in an optical cavity experience an 
environment that changes how they are coupled to the surrounding 
light field. In the weak-coupling regime, the extraction of light from 
the emitter is enhanced. But more profound effects emerge when 
single-emitter strong coupling occurs: mixed states are produced that 
are part light, part matter’, forming building blocks for quantum 
information systems and for ultralow-power switches and lasers**. 
Such cavity quantum electrodynamics has until now been the 
preserve of low temperatures and complicated fabrication methods, 
compromising its use”. Here, by scaling the cavity volume to 
less than 40 cubic nanometres and using host-guest chemistry to 
align one to ten protectively isolated methylene-blue molecules, 
we reach the strong-coupling regime at room temperature and in 
ambient conditions. Dispersion curves from more than 50 such 
plasmonic nanocavities display characteristic light-matter mixing, 
with Rabi frequencies of 300 millielectronvolts for ten methylene- 
blue molecules, decreasing to 90 millielectronvolts for single 
molecules—matching quantitative models. Statistical analysis of 
vibrational spectroscopy time series and dark-field scattering spectra 
provides evidence of single-molecule strong coupling. This dressing 
of molecules with light can modify photochemistry, opening up the
exploration of complex natural processes such as photosynthesis (ref. 9)
and the possibility of manipulating chemical bonds (ref. 10).

Creating strongly coupled mixed states from visible light and 
individual emitters is severely compromised by the hundred-fold 
difference in their spatial localization. To overcome this, high-quality 
cavities are used to boost interaction times and enhance coupling 
strengths. However, in larger cavities the longer round trip for photons
to return to the same emitter decreases the coupling, which scales as
g ∝ 1/√V, where V is the effective cavity volume and g is the coupling
energy. This coupling has to exceed both the cavity loss rate, κ, and the
emitter scattering rate, γ, in order for energy to cycle back and forth
between matter and light components, requiring 2g > γ, κ (ref. 11). For
cryogenic emitters”® (laser-cooled atoms, vacancies in diamond, or 
semiconductor quantum dots), the suppressed emitter scattering allows 
large cavities (with a high quality factor, Q, which is proportional to κ⁻¹)
to reach strong coupling. Severe technical challenges, however, restrict 
the energy, bandwidth, size and complexity of devices. Progress towards 
room-temperature devices has been limited by the unavoidable increase 
in emitter scattering, and the difficulty of reducing the volume of
dielectric-based microcavities (at wavelength λ and refractive index n) below
Vλ = (λ/n)³. At room temperature, typical scattering rates for embedded
dipoles are γ ≈ kBT, implying that suitable Q < 100, which thus requires
cavities of less than 10~°V) (Fig. 1a, dark green shaded area). 
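
To give a feel for the volumes involved, the following sketch evaluates the diffraction-limited volume Vλ = (λ/n)³ and the ratio V/Vλ. It assumes λ = 665 nm (the methylene-blue transition), n = 1.4 (the gap refractive index) and V = 40 nm³ (the upper bound quoted in the abstract); combining these three values in one estimate is an illustrative choice, not a calculation reported in the text.

```python
WAVELENGTH_NM = 665.0      # methylene-blue transition wavelength
REFRACTIVE_INDEX = 1.4     # refractive index of the molecular gap
CAVITY_VOLUME_NM3 = 40.0   # upper bound on the nanocavity volume

def diffraction_limited_volume(wavelength_nm, refractive_index):
    """V_lambda = (lambda / n)^3, in cubic nanometres."""
    return (wavelength_nm / refractive_index) ** 3

def relative_coupling(volume_ratio):
    """g scales as 1/sqrt(V); returns g relative to a cavity of volume V_lambda."""
    return volume_ratio ** -0.5

v_lambda = diffraction_limited_volume(WAVELENGTH_NM, REFRACTIVE_INDEX)
volume_ratio = CAVITY_VOLUME_NM3 / v_lambda

print(f"V_lambda = {v_lambda:.3e} nm^3")
print(f"V / V_lambda = {volume_ratio:.2e}")
print(f"coupling gain over a V_lambda cavity = {relative_coupling(volume_ratio):.0f}x")
```

Under these assumptions the gap volume is several parts in ten million of Vλ, which is why the 1/√V scaling translates into a coupling gain of three orders of magnitude.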

Improved confinement uses surface plasmons (Fig. 1a), combining 
oscillations of free electrons in metals with electromagnetic waves”. 
Structured metal films can couple molecular aggregates of high oscil- 
lator strength, but far too many molecules are involved for quantum 


optics. Recent studies have reached 1,000 molecules (refs 13–15), still far above
the one to ten molecules that are needed to access quantum effects at 
room temperature. 

To create such small nanocavities and orient single molecules 
precisely within them, we use bottom-up nanoassembly. Although field 
volumes of individual plasmonic nanostructures are too large”, smaller 
volumes and stronger field enhancements occur within subnanometre 
gaps between paired plasmonic nanoparticles. We use the promising 
nanoparticle-on-mirror (NPoM) geometry’, placing emitters in the 
gap between nanoparticles and a mirror underneath (Fig. 1b). This 
gap is accurately controlled to a subnanometre scale using molecular 
spacers, is easily made by depositing monodisperse metal nanoparticles 
onto a metal film, and is scalable, repeatable and straightforward to 
characterize’”'®. Specifically, we use gold nanoparticles of 40-nm 
diameter on a 70-nm-thick gold film, separated by a 0.9-nm molecular 
spacer (see below). The intense interaction between each nanoparticle 
and its image forms a dimer-like construct with field enhancements of 
~10°, and an ultralow mode volume. The coupled plasmonic dipolar 
mode is localized in the gap (Fig. 1b), with the electric field oriented 
vertically (along the z direction). The resonant wavelength is deter- 
mined by the nanoparticle size and gap thickness, and can thus be 
tuned from 600 nm to 1,200 nm (ref. 17). 

Several factors are essential in positioning a quantum emitter inside 
these small gaps. One is to prevent molecular aggregation, which occurs 
commonly. Another is to ensure that the transition dipole moment, μm,
is perfectly aligned with the gap plasmon (along the electric field). We
use a common dye molecule, methylene blue, with a molecular transition
at 665 nm, to which our plasmons are tuned. To avoid aggregation
of the dye molecules and to assemble them in the proper orientation, 
we use the host-guest chemistry of macrocyclic cucurbit[n]uril mol- 
ecules. These are pumpkin-shaped molecules with varying hollow 
hydrophobic internal volumes, determined by the number of units in 
the ring (n), in which guest molecules can sit (Supplementary Fig. 1). 
Cucurbit[7]uril is water-soluble and can accommodate only one 
methylene-blue molecule inside. Encapsulation of methylene blue 
inside cucurbit[7]uril is confirmed by absorption spectroscopy 
(Fig. 2a): methylene-blue dimers (shown by the small ‘shoulder’ peak 
at 625 nm on the red curve) disappear on mixing low methylene-blue
concentrations with cucurbit[7]uril (in a 1:10 molar ratio) (Fig. 2a, blue 
curve). Control experiments with the smaller cucurbit[5]uril molecules 
(into which methylene blue cannot fit) do not remove this shoulder 
peak (Fig. 2a, dashed line), ruling out parasitic binding. Placing single 
methylene-blue molecules into cucurbit[7]uril thus avoids any aggrega- 
tion. Carbonyl portals at either end of the 0.9-nm-high cucurbit[n]uril 
molecules bind them with their rims flat onto the gold surface (Fig. 2b). 
When a monolayer of cucurbit[7]uril is first deposited on the gold 
mirror and suitably filled with methylene-blue molecules, gold 
nanoparticles can bind on top to form the desired filled nanocavity 


1 NanoPhotonics Centre, Cavendish Laboratory, University of Cambridge, Cambridge CB3 0HE, UK. 2 Melville Laboratory for Polymer Synthesis, Department of Chemistry, University of Cambridge,
Lensfield Road, Cambridge CB2 1EW, UK. 3 Department of Chemistry, King’s College London, London SE1 1DB, UK. 4 Blackett Laboratory, Department of Physics, Prince Consort Road,
Imperial College, London SW7 2AZ, UK.




[Figure 1a: plot with horizontal axis quality factor, Q; regions labelled weak coupling, strong coupling, dielectric and plasmonic.]


Figure 1 | Comparing single-molecule optical cavities. a, The quality
factor, Q, of a nanocavity is plotted against its effective volume, V/Vλ
(scaled to Vλ = (λ/n)³), showing strong-coupling (green arrow), room-
temperature (blue arrow), and plasmonic (orange arrow) regimes for
single emitters. The icons show realizations of each type of nanocavity:
from right, whispering gallery spheres (used as microresonators in
filters, sensors and lasers), microdisks, photonic crystals (with possible
applications in optical computing), micropillars (used in high-throughput

(Fig. 2b and Supplementary Information), with the methylene-blue 
molecule aligned vertically in the gap!’. Previous studies!” with 
empty cucurbit[n]urils show that the gap is 0.9 nm, with a refractive
index of 1.4. 

Dark-field scattering spectra from individual NPoMs show the effect
of aligning the emitter in different orientations (Fig. 3a). With μm
parallel to the mirror (top; without cucurbit[n]urils the methylene blue lies
flat on the metal surface), the resonant scattering plasmonic peak (ωp)
is identical to that of NPoMs without any emitters (ω0). But with μm
perpendicular to the mirror (bottom), the spectra show two split peaks
(ω+ and ω−) resulting from the strong interaction between emitters
and plasmon. We contrast three types of samples. Without dye (Fig. 3b,
top), a consistent gap plasmon (ωp) at 660 ± 10 nm is seen. Small
fluctuations in peak wavelength are associated with ±5-nm variations in
nanoparticle size (Supplementary Fig. 2). When this NPoM is partially
filled with methylene blue inside the cucurbit[7]uril, peaks at 610 nm
and 750 nm are seen either side of the absorption peak of methylene
blue at ω0 (Fig. 3b, bottom), corresponding to the formation of hybrid
plasmon-exciton (‘plexciton’) branches, ω± = ω0 ± g/2. This yields a
Rabi frequency of g = 380 meV, confirmed by full three-dimensional
finite-difference time-domain (FDTD) simulations (Supplementary
Fig. 3). While some studies (refs 13, 14) have shown significant variations in
ω±, we obtain highly consistent results, with no spectral wandering
observed on individual NPoMs. With dye molecules perpendicular to
the plasmon field (without cucurbit[n]urils), only a gap plasmon is seen
(Supplementary Fig. 4c). Methylene-blue molecules self-assembling on
gold orient flat to the surface, owing to π-stacking interactions between
the conjugated phenyl rings and the metal film (ref. 20). Our study thus shows
how molecular scaffolding is essential to yield molecular coupling to
the gap plasmon.

To map the dispersion curve, we combine scattering spectra from 
differently sized nanoparticles, plotted according to their detuning 
from the absorption (‘exciton’) resonance. Simulations of nanoparticles 
of 40-60 nm in diameter (Supplementary Fig. 5) show gap plasmons 
tuning across the exciton. A simple coupled-oscillator model matches 
the quantum mechanical Jaynes-Cummings picture’: 


$$\omega_{\pm} = \frac{1}{2}\left(\omega_{\mathrm{p}} + \omega_{0}\right) \pm \frac{1}{2}\sqrt{g^{2} + \delta^{2}}$$


with plasmon and exciton resonance energies ωp and ω0, and detuning
energies of δ = ωp − ω0. Extracting ω± from the scattering spectra allows
ωp to be calculated (knowing ω0, which does not show any spectral


screening), and nanoparticle-on-mirror geometry (NPoM, used here). 
Purcell factors (P) show emission-rate enhancements. b, Diagram of a 
NPoM. The blue arrow in the gap between the nanoparticle and the mirror 
locates the transition dipole moment of the emitter. The inset above shows 
the simulated near-field of the coupled gap plasmon in the dashed box, 
with maximum electric field enhancement of about 400, oriented vertically 
(in the z direction). 


wandering). This fitting reveals typical anticrossing (mixing) behaviour
(Fig. 3c), with g = 305 ± 8 meV at δ = 0. We find 2g/γpl ≈ 5, well
into the strong coupling regime. A key figure of merit is the Purcell
factor, P = Q/V, which characterizes different cavity systems (Fig. 1a).
For our plasmonic nanocavities, P= 3.5 x 10° (Supplementary Fig. 6); 
this is over an order of magnitude larger than the Purcell factors of 
state-of-the-art photonic crystal cavities’, which have reached 10°, 
while state-of-the-art planar micropillars”!” attain Purcell factors of 
3 x 10°. The ultralow cavity volume arises here because of the very 
large field confinement in such nanometre-sized gaps (Supplementary 
Fig. 9e). Such Purcell factors imply photon emission times below
100 femtoseconds, seen as the ħ/g ≈ 30-femtosecond Rabi flopping,
but too short to measure directly.
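
The extraction of ωp and g from a pair of measured peaks can be sketched as follows. The sketch assumes the standard two-oscillator form ω± = (ωp + ω0)/2 ± ½√(g² + δ²) with δ = ωp − ω0, which reduces to ω± = ω0 ± g/2 at zero detuning. The function names are hypothetical, and the peak wavelengths (610 nm and 750 nm, with the dye at 665 nm) are the example values quoted above, not a reproduction of the fitting procedure used for Fig. 3c.

```python
import math

HC_EV_NM = 1239.841984  # h*c in eV nm, for converting wavelength to photon energy

def nm_to_ev(wavelength_nm):
    """Photon energy in eV for a vacuum wavelength in nm."""
    return HC_EV_NM / wavelength_nm

def hybrid_modes(omega_p, omega_0, g):
    """Upper and lower hybrid-mode energies of two coupled oscillators."""
    delta = omega_p - omega_0
    mean = 0.5 * (omega_p + omega_0)
    half_split = 0.5 * math.sqrt(g * g + delta * delta)
    return mean + half_split, mean - half_split

def extract_plasmon_and_coupling(omega_plus, omega_minus, omega_0):
    """Invert the model: return (omega_p, delta, g) from the two peaks and omega_0."""
    omega_p = omega_plus + omega_minus - omega_0   # the sum of the modes is conserved
    delta = omega_p - omega_0
    g_squared = (omega_plus - omega_minus) ** 2 - delta ** 2
    if g_squared < 0:
        raise ValueError("omega_0 must lie between the two hybrid modes")
    return omega_p, delta, math.sqrt(g_squared)

omega_0 = nm_to_ev(665.0)
omega_plus = nm_to_ev(610.0)
omega_minus = nm_to_ev(750.0)

omega_p, delta, g = extract_plasmon_and_coupling(omega_plus, omega_minus, omega_0)
print(f"omega_p = {omega_p:.3f} eV, delta = {delta * 1e3:.0f} meV, g = {g * 1e3:.0f} meV")
```

Because ω+ + ω− = ωp + ω0 in this model, the plasmon energy follows from the two peak positions alone once ω0 is known, and g is then fixed by the peak separation.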

To probe single-molecule strong coupling, we systematically decrease 
the number of methylene-blue molecules by reducing the ratio of meth- 
ylene blue to cucurbit[7]uril. Previous studies and simple area estimates 
imply that 100 cucurbit[7]uril molecules lie inside each nanocavity 
(Supplementary Fig. 9). With the initial 1:10 molar ratio of methylene 


[Figure 2a: absorbance (arbitrary units) versus wavelength (nm), 550–750 nm, for methylene blue in water, inside cucurbit[7]uril and near cucurbit[5]uril.]


Figure 2 | Plasmonic nanocavity containing a dye molecule.
a, Absorption spectra of methylene blue in water, with (blue) and without
(red) encapsulation in cucurbit[n]urils of different diameters (dashed and
solid red lines). Icons show individual molecules (in blue; line centred at ω0)
and paired molecular dimers (in red). b, Illustration of a methylene-blue
molecule in cucurbit[n]uril, in the nanoparticle-on-mirror geometry
used here.




[Figure 3, panels a and b; horizontal axis: wavelength (nm), 500–800 nm.]


Figure 3 | Strong coupling seen in scattering spectra of individual
NPoMs. a, Scattering spectra resulting from isolated NPoMs according to
the orientation of the emitter (the methylene-blue dye; see insets). With
the dye transition dipole moment, μm, oriented parallel to the mirror,
the resonant scattering plasmonic peak (ωp) is identical to that of NPoMs
without any emitters. With μm oriented perpendicular to the mirror, split
peaks result from the strong interaction between the emitter and the plasmon.
The blue dashed line indicates the dye’s absorption wavelength (centred
at ω0). b, Comparison of scattering spectra from different NPoMs (see
insets), whose gaps are filled by a cucurbit[7]uril monolayer that is empty
(top), or encapsulating dye molecules (bottom). c, Resonant positions of
methylene-blue (ω0), plasmon (ωp) and hybrid modes (ω+ and ω−) as a
function of extracted detuning. The symbol size depicts the amplitude in
scattering spectra.


blue:cucurbit[7]uril, the mean number (n̄) of methylene-blue molecules
within each mode volume is thus 10. We explore many plasmonic
nanocavities with a mean dye number of 10 or less (Fig. 4a). From the
resulting spectra, we extract coupling strengths at different mean dye
numbers, and plot these along with the predicted coupling strength:


$$g = \mu_{\mathrm{m}}\sqrt{\frac{4\pi\hbar n c}{\lambda\varepsilon\varepsilon_{0}V}}$$


where μm = 3.8 D is the transition dipole moment of isolated
methylene-blue molecules (ref. 23). The probability of finding each coupling strength
(Fig. 4a, colour map) follows the Poisson distribution for n molecules
under each nanoparticle. The range of Rabi splittings seen for n̄ = 2.5 that
exceed thermal- and cavity-loss rates at room temperature is consistent
with the idea that our plasmonic nanocavity is supporting single-molecule
strong coupling. Reassuringly, the range of Rabi frequencies observed
increases as the molecular concentration is reduced, as would be expected

[Figure 4: panels a–f; recoverable axis labels: mean number of dye molecules, n̄; nanoparticle number; wavelength (nm); probability of dye event; Raman shift (cm⁻¹).]

Figure 4 | Rabi splitting from few molecules. a, Energy of Rabi
oscillations (g) versus mean number of dye molecules (n̄). Experimental
(white) points are shown, together with the range of measured coupling
strengths (error bars) compared with the theoretical curve (dashed line).
The colours represent the Poisson probability distribution of n̄.
b, Coupling strength extracted from different NPoMs in a sample of
n̄ = 2.5. The bars show the theoretical coupling strength obtained from a
perfect model; the dashed lines show a random-placement model. c–e,
Scattering spectra for one, two and three molecules (corresponding to b),
with fits. f, Single-molecule probability histograms for n̄ = 0.2 and 2.5,
derived from modified principal-component analysis (Supplementary
Fig. 13). The yellow bars show single-molecule events. The insets show
the Raman signatures of the two different types of molecular event.




given that $\Delta g(\bar{n}) \propto \sqrt{\bar{n} + \bar{n}^{1/2}} - \sqrt{\bar{n} - \bar{n}^{1/2}}$ similarly increases, as
observed in Fig. 4a (colour map).
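
As a rough numerical check of the predicted coupling strength, the sketch below evaluates g = μm√(4πħnc/(λεε0V)). This form of the expression is an assumption, as are the inputs combined with it: μm = 3.8 D and λ = 665 nm from the text, ε = 1.4² from the quoted gap refractive index, and V = 40 nm³ from the upper bound in the abstract. The function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J s
C = 2.99792458e8              # speed of light, m/s
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
DEBYE = 3.33564e-30           # 1 debye in C m
EV = 1.602176634e-19          # 1 eV in J

def coupling_energy_mev(n_molecules, dipole_debye=3.8, wavelength_nm=665.0,
                        refractive_index=1.4, volume_nm3=40.0):
    """Coupling energy g in meV for n aligned emitters in a cavity of volume V.

    Assumes g = mu * sqrt(4*pi*hbar*n*c / (lambda * eps * eps0 * V)) with
    eps = refractive_index**2; all default values are illustrative inputs.
    """
    mu = dipole_debye * DEBYE
    wavelength = wavelength_nm * 1e-9
    eps = refractive_index ** 2
    volume = volume_nm3 * 1e-27
    field = math.sqrt(4.0 * math.pi * HBAR * n_molecules * C
                      / (wavelength * eps * EPSILON_0 * volume))
    return mu * field / EV * 1e3

for n in (1, 2, 3, 10):
    print(f"n = {n:2d}: g = {coupling_energy_mev(n):.0f} meV")
```

With these inputs the single-molecule value comes out at roughly 73 meV and grows as √n; since 40 nm³ is only an upper bound on the volume, the estimate should be read as a lower bound on g.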

Direct proof of single-molecule strong coupling is seen from the
coupling strengths extracted from the lowest-density samples (n̄ = 2.5):
these show distinct, systematic jumps matching the expected increase
in gₙ as n rises from one to three dye molecules (Fig. 4b; NPoMs are
sorted according to increasing Rabi splitting). The range in each value
of gₙ arises because single molecules are located at different lateral
positions within the gap plasmon, thus coupling with different strengths
(predictions are shown as dashed lines in Fig. 4b). Experimentally, we
find excellent agreement (with no fitting parameters), showing that a
single methylene-blue molecule in our nanocavities gives Rabi splittings
of 80-95 meV. Further, we plot the scattering spectrum from
n = 1-3 molecules, revealing clear increases in coupling strength.
Additional proof of the single-molecule strong coupling is seen from
the anticrossing of plasmon and exciton modes for the subset with n = 1
(Supplementary Fig. 20).

For weakly coupled single molecules, emitted fluorescence should
follow the Purcell factor (refs 24–26). However, such measurements fail here in
the strong-coupling regime, because resonantly pumping the molecular
absorption also generates strong surface-enhanced resonant Raman
scattering (SERRS), consisting of sharp lines with a strong
background, that cannot be uniquely separated from photoluminescence
(Supplementary Fig. 11). This also obscures the g⁽²⁾ measurements
that are typically used to confirm single-photon emission from individual
chromophores. Here we find extremely strong emission, even
though the dye molecules are within 0.5 nm of absorptive gold (ref. 27),
owing to the high radiative efficiency of our nanocavities. We harvest


these strong SERRS peaks to construct ‘chemical’ g⁽²⁾ values, by using
the well established bianalyte technique with a second near-identical 
but distinguishable molecule to prove single-molecule statistics 
(Fig. 4f and Supplementary Figs 15-17). As clearly evident, at the lowest 
concentrations two molecules are almost never found at the same time, 
and we are truly in the single-molecule regime. Although this does not 
guarantee direct correlation with single-molecule strong-coupling sit- 
uations, it does prove the statistical probability of single molecules at 
this concentration. Convincing proof of the presence of single mole- 
cules is also provided by the spectral diffusion of vibrational lines in 
time-series SERRS scans from nanoparticles exhibiting single-molecule 
strong coupling (Supplementary Figs 18 and 19). 
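
The statistical argument can be illustrated with a short Poisson calculation. The sketch assumes that the number of dye molecules per nanocavity is Poisson distributed with the mean values quoted for Fig. 4 (0.2 and 2.5); it is a simplified stand-in for the bianalyte analysis, and the function names are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k molecules for a Poisson mean occupancy."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def occupancy_summary(mean):
    """Probabilities of empty, singly occupied and multiply occupied cavities."""
    p_empty = poisson_pmf(0, mean)
    p_single = poisson_pmf(1, mean)
    return {"empty": p_empty, "single": p_single,
            "multiple": 1.0 - p_empty - p_single}

for mean in (0.2, 2.5):
    summary = occupancy_summary(mean)
    single_if_occupied = summary["single"] / (1.0 - summary["empty"])
    print(f"mean = {mean}: " +
          ", ".join(f"{name} = {p:.3f}" for name, p in summary.items()) +
          f", single given occupied = {single_if_occupied:.3f}")
```

At a mean of 0.2, fewer than 2% of cavities hold two or more molecules and about nine in ten occupied cavities hold exactly one, whereas at a mean of 2.5 most cavities hold several.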

We have succeeded in combining the gap plasmon with oriented
host-guest chemistry in aqueous solution to create enormous numbers of
strongly coupled, few-molecule nanocavities at room temperature, in ambient
conditions, and which are optically addressable. We envisage numerous
applications, including single-photon emitters, photon blockades (ref. 7),
quantum chemistry (refs 28–30), nonlinear optics, and tracked or directed
molecular reactions.


Received 16 July 2015; accepted 1 April 2016. 
Published online 13 June 2016. 


1. Tame, M. S. et al. Quantum plasmonics. Nature Phys. 9, 329-340 (2013).

2. Koenderink, A. F., Alù, A. & Polman, A. Nanophotonics: shrinking light-based
technology. Science 348, 516-521 (2015).

3. Sato, Y. et al. Strong coupling between distant photonic nanocavities and its
dynamic control. Nature Photon. 6, 56-61 (2012).

4. Liu, X. et al. Strong light-matter coupling in two-dimensional atomic crystals.
Nature Photon. 9, 30-34 (2015).

5. Yoshie, T. et al. Vacuum Rabi splitting with a single quantum dot in a photonic
crystal nanocavity. Nature 432, 200-203 (2004).

6. Thompson, J. D. et al. Coupling a single trapped atom to a nanoscale optical
cavity. Science 340, 1202-1205 (2013).

7. Faraon, A. et al. Coherent generation of non-classical light on a chip via
photon-induced tunnelling and blockade. Nature Phys. 4, 859-863 (2008).




8. Gröblacher, S. et al. An experimental test of non-local realism. Nature 446,
871-875 (2007).

9. Coles, D. M. et al. Strong coupling between chlorosomes of photosynthetic
bacteria and a confined optical cavity mode. Nature Commun. 5, 5561 (2014).

10. Shalabney, A. et al. Coherent coupling of molecular resonators with a
microcavity mode. Nature Commun. 6, 5981 (2015).

11. Törmä, P. & Barnes, W. L. Strong coupling between surface plasmon polaritons
and emitters: a review. Rep. Prog. Phys. 78, 013901 (2015).

12. Novotny, L. & Hecht, B. Principles of Nano-Optics (Cambridge Univ. Press, 2006).

13. Zengin, G. et al. Realizing strong light-matter interactions between single-
nanoparticle plasmons and molecular excitons at ambient conditions.
Phys. Rev. Lett. 114, 157401 (2015).

14. Zengin, G. et al. Approaching the strong coupling limit in single plasmonic
nanorods interacting with J-aggregates. Sci. Rep. 3, 3074 (2013).

15. Schlather, A. E., Large, N., Urban, A. S., Nordlander, P. & Halas, N. J. Near-field
mediated plexcitonic coupling and giant Rabi splitting in individual
metallic dimers. Nano Lett. 13, 3281-3286 (2013).

16. Ciracì, C. et al. Probing the ultimate limits of plasmonic enhancement. Science
337, 1072-1074 (2012).

17. de Nijs, B. et al. Unfolding the contents of sub-nm plasmonic gaps using
normalising plasmon resonance spectroscopy. Faraday Discuss.
178, 185-193 (2015).

18. Benz, F. et al. Nanooptics of molecular-shunted plasmonic nanojunctions.
Nano Lett. 15, 669-674 (2015).

19. Kasera, S., Herrmann, L. O., del Barrio, J., Baumberg, J. J. & Scherman, O. A.
Quantitative multiplexing with nano-self-assemblies in SERS. Sci. Rep. 4,
6785 (2014).

20. Netzer, F. P. & Ramsey, M. G. Structure and orientation of organic molecules on
metal surfaces. Crit. Rev. Solid State Mater. Sci. 17, 397-475 (1992).

21. Vahala, K. J. Optical microcavities. Nature 424, 839-846 (2003).

22. Khitrova, G., Gibbs, H. M., Kira, M., Koch, S. W. & Scherer, A. Vacuum Rabi
splitting in semiconductors. Nature Phys. 2, 81-90 (2006).

23. Patil, K., Pawar, R. & Talap, P. Self-aggregation of methylene blue in aqueous
medium and aqueous solutions of Bu4NBr and urea. Phys. Chem. Chem. Phys.
2, 4313-4317 (2000).

24. Akselrod, G. M. et al. Probing the mechanisms of large Purcell enhancement in
plasmonic nanoantennas. Nature Photon. 8, 835-840 (2014).

25. Anger, P., Bharadwaj, P. & Novotny, L. Enhancement and quenching of
single-molecule fluorescence. Phys. Rev. Lett. 96, 113002 (2006).

26. Kinkhabwala, A. et al. Large single-molecule fluorescence enhancements
produced by a bowtie nanoantenna. Nature Photon. 3, 654-657 (2009).

27. Kravtsov, V., Berweger, S., Atkin, J. M. & Raschke, M. B. Control of plasmon
emission and dynamics at the transition from classical to quantum coupling.
Nano Lett. 14, 5270-5275 (2014).

28. Hutchison, J. A., Schwartz, T., Genet, C., Devaux, E. & Ebbesen, T. W. Modifying
chemical landscapes by coupling to vacuum fields. Angew. Chem. Int. Ed. 51,
1592-1596 (2012).

29. Galego, J., Garcia-Vidal, F. J. & Feist, J. Cavity-induced modifications of
molecular structure in the strong coupling regime. Phys. Rev. X 5, 041022
(2015).

30. Feist, J. & Garcia-Vidal, F. J. Extraordinary exciton conductance induced by
strong coupling. Phys. Rev. Lett. 114, 196402 (2015).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We acknowledge financial support from the UK’s
Engineering and Physical Sciences Research Council (grants EP/G060649/1,
EP/N020669/1, EP/L027151/1 and EP/I012060/1) and the European
Research Council (grant LINASS 320503). This study was partially supported
by the Air Force Office of Scientific Research (AFOSR); the European Office
of Aerospace Research and Development (EOARD) is also acknowledged.
R.C. acknowledges support from the Dr. Manmohan Singh scholarship from
St John’s College, University of Cambridge. F.B. acknowledges support from
the Winton Programme for the Physics of Sustainability. S.J.B. acknowledges
support from the European Commission for a Marie Curie Fellowship
(NANOSPHERE, 658360).


Author Contributions J.J.B. and R.C. conceived and designed the experiments.
R.C. performed the experiments with input from F.B. and B.d.N. R.C. and A.D.
carried out the simulation and the analytical modelling with input from J.J.B.,
P.F., O.H. and E.R. R.C. and J.J.B. analysed the data. S.J.B. and O.A.S. synthesized
cucurbit[n]uril and provided input on the fabrication and characterization of
samples. R.C. and J.J.B. wrote the manuscript with input from all authors.


Author Information Data supporting this paper are available at
https://www.repository.cam.ac.uk/handle/1810/254579. Reprints and permissions
information is available at www.nature.com/reprints. The authors declare no
competing financial interests. Readers are welcome to comment on the online
version of the paper. Correspondence and requests for materials should be
addressed to J.J.B. (jjb12@cam.ac.uk).




LETTER 


doi:10.1038/nature18284 


Lanthanum-catalysed synthesis of microporous 3D 
graphene-like carbons in a zeolite template 


Kyoungsoo Kim!, Taekyoung Lee’, Yonghyun Kwon!*, Yongbeom Seo!, Jongchan Song®, Jung Ki Park?, Hyunsoo Lee!, 
Jeong Young Park!+, Hyotcherl Ihee!?, Sung June Cho® & Ryong Ryoo!? 


Three-dimensional graphene architectures with periodic 
nanopores—reminiscent of zeolite frameworks—are of topical 
interest because of the possibility of combining the characteristics 
of graphene with a three-dimensional porous structure. Lately, 
the synthesis of such carbons has been approached by using zeolites 
as templates and small hydrocarbon molecules that can enter the 
narrow pore apertures’ !°. However, pyrolytic carbonization of the 
hydrocarbons (a necessary step in generating pure carbon) requires 
high temperatures and results in non-selective carbon deposition 
outside the pores. Here, we demonstrate that lanthanum ions 
embedded in zeolite pores can lower the temperature required 
for the carbonization of ethylene or acetylene. In this way, a 
graphene-like carbon structure can be selectively formed inside the 
zeolite template, without carbon being deposited at the external 
surfaces. X-ray diffraction data from zeolite single crystals after 
carbonization indicate that electron densities corresponding to 
carbon atoms are generated along the walls of the zeolite pores. 
After the zeolite template is removed, the carbon framework 
exhibits an electrical conductivity that is two orders of magnitude 
higher than that of amorphous mesoporous carbon. Lanthanum 
catalysis allows a carbon framework to form in zeolite pores 
with diameters of less than 1 nanometre; as such, microporous 
carbon nanostructures can be reproduced with various topologies 
corresponding to different zeolite pore sizes and shapes. We 
demonstrate carbon synthesis for large-pore zeolites (FAU, EMT 
and beta), a one-dimensional medium-pore zeolite (LTL), and even 
small-pore zeolites (MFI and LTA). The catalytic effect is a common 
feature of lanthanum, yttrium and calcium, which are all carbide- 
forming metal elements. We also show that the synthesis can be 
readily scaled up, which will be important for practical applications 
such as the production of lithium-ion batteries and zeolite-like 
catalyst supports. 

Zeolites are a family of microporous crystalline aluminosilicate 
materials, which fall into more than 200 structural types!®. Each 
structural type is distinguished by its unique pore structure—for 
example, in terms of its pore diameters, shapes and connectivity!” 
The pore diameters are typically between 0.3 nm and 1.3 nm. Another 
important characteristic of zeolites is their ion-exchange capacity'*". 
Zeolite frameworks contain cations that compensate for the negative
charge at the aluminium sites in the tetrahedral silicate framework. The
cations (normally sodium or ammonium ions in the as-synthesized
material) can be exchanged with other cations through a conventional
solution-based ion-exchange process.

In recent years, zeolites have attracted attention as a template for 
carbon synthesis⁷⁻¹⁵,²⁰,²¹. The pores in many zeolites have diameters
appropriate for accommodating fullerenes and carbon nanotubes,
and are interconnected along the smoothly curved surface to form
a three-dimensional (3D) network that is open to the exterior. In
principle, such a nanoporous system should be ideal as a template
for synthesizing a 3D graphene architecture⁴⁻⁶. However, the zeolite
pores are too small to accommodate bulky molecular compounds, 
such as sucrose, polyaromatic compounds, and furfuryl alcohol, 
which are commonly used for carbon synthesis with mesoporous 
silica templates²⁰,²¹. Small molecules, such as ethylene and acetylene,
are desirable as a carbon source for achieving successful
carbonization within the zeolite pores. But carbonization of these 
small hydrocarbons generally requires high-temperature reactions to 
fix the carbon source inside the pores. At such high temperatures, the 
reactions tend to occur non-selectively on the external surfaces as 
well as on the internal pore walls¹³⁻¹⁵. This often results in coke being
deposited at the external surfaces, causing serious diffusion limitations 
into the pores. 

Here we tackled this problem by using La³⁺ ions. We intuited that
such a transition-metal element would bond with olefins, acetylenes
and aromatic compounds through d–π coordination. If so, then the
d–π interactions should stabilize ethylene and the pyrocondensation
intermediates to form a carbon framework in the zeolite. We would then
expect carbonization to occur selectively inside the La³⁺-containing
zeolite pores.

To test this hypothesis, we carried out ion exchange of an Na⁺-containing
form of the zeolite faujasite-Y (FAU-Y; that is, NaY zeolite)
with La³⁺. We heated the resulting LaY zeolite under carbon-synthesis
conditions using ethylene gas for 1 hour at different temperatures 
(see Methods). We analysed the amount of carbon deposition at each 
temperature by thermogravimetry, and plotted the analysis data as a 
function of temperature (Extended Data Fig. 1). We also compared 
these LaY data with the results obtained from other cation-containing 
forms of the zeolite, such as NaY and HY. The data indicate that the 
LaY, NaY and HY zeolite samples all show rapid carbon deposition 
at 800°C. However, as the temperature decreases, the different ionic 
forms behave dramatically differently: at 600°C, the LaY zeolite is still 
active as a carbon-deposition template, whereas both NaY and HY 
lose this function almost completely. This result highlights a catalytic 
effect of lanthanum on carbonization. Usually, in carbon synthesis,
the proton form of zeolite is preferred as a template. This is due to
the presence of Lewis and Brønsted acid sites that can catalyse the
pyrocondensation of hydrocarbons into polymeric coke species²².
But carbon deposition in LaY occurs more than 20 times faster than
in such an acidic HY zeolite (based on our chosen ethylene flow for
1 hour at 600°C). The ethylene flow can also be safely prolonged until
all internal pores are fully saturated with carbon; the deposition of any 
amorphous or graphitic carbon on external surfaces is still prevented 
(Extended Data Fig. 2). 

We investigated the carbon structure using solid-state, magic-angle
spinning ¹³C nuclear magnetic resonance (NMR) spectroscopy
after the deposition of ¹³C-labelled carbon in the LaY zeolite


¹Center for Nanomaterials and Chemical Reactions, Institute for Basic Science (IBS), Daejeon 305-701, Korea. ²Department of Chemistry, KAIST, Daejeon 34141, Korea. ³Department of Chemical
and Biomolecular Engineering, KAIST, Daejeon 34141, Korea. ⁴Graduate School of EEWS, KAIST, Daejeon 34141, Korea. ⁵Clean Energy Technology Laboratory and Department of Chemical
Engineering, Chonnam National University, Gwangju 61186, Korea.


00 MONTH 2016 | VOL 000 | NATURE | 1 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


Figure 1 | Electron-density map of the supercage of zeolite FAU after 
carbon deposition. a, Three-dimensional electron-density map of the 
carbon framework formed at 600 °C, excluding the zeolite framework. 

b, c, Enlarged images of the electron-density map, including the zeolite 
framework (cyan), from different viewpoints: along the (110) axis (b) and 


(Extended Data Fig. 3). The NMR spectrum exhibits two slightly
separated peaks (at 123 and 129 parts per million, p.p.m.). These
NMR peaks can be interpreted as sp² carbon species in six-membered,
and in five- or seven-membered, carbon rings, respectively²³. Thus,
all carbon atoms in the zeolite-carbon composite sample have an
sp²-hybridized bonding nature within the detection limit of the ¹³C
NMR spectroscopy.

Here, the question is whether the carbon structure is built systematically
like a 3D graphene along smoothly curved surfaces on pore walls,
or exists randomly in the template pore volume. We sought the answer 
by studying X-ray diffraction (XRD) data of large single crystals of 
FAU after carbon deposition. Figure 1a shows an electron-density map 
of atoms that were brought into the zeolite micropore (designated the 
‘supercage’) during carbon deposition. We obtained this map by using 
the difference Fourier method with X-ray single-crystal diffraction 
data, which we collected after fully dehydrating the zeolite-carbon 
composite sample (to exclude moisture in the supercage) by flowing 
nitrogen gas at 600°C and then placing the sample in a vacuum at 
350°C (see Methods). All of the electron densities in the supercage 
can thus be attributed to carbonaceous frameworks. 
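
For reference, the difference Fourier synthesis named above is conventionally written as follows. This is the textbook form rather than an expression taken from this work; the symbols $V$ (unit-cell volume), $\mathbf{h}$ (reciprocal-lattice vector), $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ (observed and model structure factors) and $\varphi_{\mathrm{calc}}$ (model phase) are notation assumed here, with the model taken to be the zeolite framework without the deposited carbon.

```latex
\Delta\rho(\mathbf{r}) = \frac{1}{V}\sum_{\mathbf{h}}
  \bigl(|F_{\mathrm{obs}}(\mathbf{h})| - |F_{\mathrm{calc}}(\mathbf{h})|\bigr)\,
  e^{i\varphi_{\mathrm{calc}}(\mathbf{h})}\,
  e^{-2\pi i\,\mathbf{h}\cdot\mathbf{r}}
```

Density that the model does not account for, here the carbon in the supercage, appears as positive peaks in $\Delta\rho$.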

The electron densities indicate diffused atomic positions in the 
wide section of the supercage; these atomic positions correspond to 
a hexagonal ring of carbon atoms, as in a graphene net (Fig. 1b, c). 
In particular, the density map at cross-sectional cuts exhibits hollow
images, indicating that the carbon atoms are systematically deposited
along the zeolite supercage surface. However, in the narrow
space between adjoining supercages, the electron densities are more
diffuse and crowded. Some of the density portions are too close together
to be assigned as carbon-carbon bonds. This can be interpreted as the average
electron-density map superimposing various atomic positions over
many identical pore necks; in other words, there is high static disorder.
Because of the severe disorder and fractional occupancy, the exact
single-crystal structure was difficult to solve unless constraints were 
used in the refinement process (Extended Data Table 1, Extended Data 
Fig. 4, and Methods). 

The carbon framework obtained at 600°C can be separated from
the template by using a mixture of hydrogen fluoride and hydrochloric
acid to remove the zeolite. The carbon thus recovered exhibits a
narrow distribution of pore diameters in the micropore region,
corresponding to the thickness of the template walls. The carbon, however,
exhibits only poorly resolved XRD peaks and transmission electron
microscope (TEM) lattice fringes, indicating that the pores are not well
ordered. The loss of pore order seems to result from insufficient formation
of carbon-carbon bonds in the narrow necks between adjacent
supercages at 600°C. To obtain a carbon product with highly ordered 
pores, the carbon-zeolite composite needs to be heated to 850°C after 
the carbon-deposition step at 600°C. This heat treatment involves a 
small lattice contraction of the zeolite and loss of more than half of the 
XRD peaks (Supplementary Fig. 1), indicating that the zeolite template 




the (111) axis (c). The electron-density map corresponds to the electron- 
density difference between zeolite and the carbon-zeolite composite, and 
was obtained using the difference Fourier method. The iso-surface level 
of the electron density is set to 0.25 electrons per Å³ (yellow) and 0.35
electrons per Å³ (red). White areas represent cross-sectional cuts.


behaves like a shrinking mould to allow a rigid carbon framework to 
form. 

The final carbon product, which is liberated from LaY after heating
at 850 °C, is an exact replica of the zeolite pore structure. This carbon
exhibits structures that are highly ordered (for a microporous carbon) in
TEM images and powder XRD patterns (Fig. 2 and Supplementary
Fig. 2). The TEM images show no carbon deposition on external
surfaces. Approximately 80% of the zeolite pores are replicated with
carbon (see Supplementary Discussion and Supplementary Fig. 3 for
a more detailed quantitative analysis of the ‘quality’ of carbon replication).
This carbon exhibits high thermal stability in air, as compared
with mesoporous carbons that are composed of amorphous
frameworks²⁴, or with other zeolite-templated carbons that are
synthesized by two-step carbon infiltration into H⁺ or Na⁺ zeolites
(Extended Data Fig. 5). The thermal stability of the carbon is comparable
to that of graphene nanosheets. The high thermal stability
and well ordered structure can be attributed solely to the effect
of lanthanum on carbon deposition. To check this effect, we carried
out La³⁺ ion exchange into EMT and beta zeolites, and used these
zeolites as templates. The resulting carbon frameworks also exhibit
high thermal stability and a highly ordered microporous structure
(Fig. 2, Supplementary Fig. 2 and Extended Data Fig. 5).

We investigated the possibility of observing a graphene-like atomic 
arrangement by means of a high-resolution TEM instrument (see 
Methods), but we failed to obtain direct atomic images. The carbon 
framework was instantly damaged under the atomic-scale observation 
condition (which requires an electron beam of very high intensity), 
even when the electron-acceleration voltage was reduced to 80 kV. In
an alternative attempt, we took a selected-area electron-diffraction
(SAED) pattern of the carbon synthesized using LaY (Fig. 2e). The
SAED pattern showed two low-intensity diffraction rings at the
same Bragg angles as graphene (100) and (110) reflections²⁵, unlike
the SAED pattern of amorphous carbon. We interpret the SAED
pattern of the LaY-templated carbon as revealing random orientations
of six-membered carbon rings existing in a curved, single-layer
graphene-like structure. This sp² carbon character is confirmed by the
NMR spectrum (Extended Data Fig. 3). Moreover, a Raman spectrum
shows a strong G-band in addition to a D-band (Extended Data Fig. 6),
much like in previous reports on curved nanographene samples²⁵,²⁶.
The G-band is upshifted from the position of graphite; such a shift is
often attributed to a curvature in the graphene structure²⁵,²⁶. Another
notable feature of the LaY-templated carbon is high electrical conductivity
(Fig. 2f and Extended Data Fig. 7). We investigated local electrical
conductance using conductive probe atomic force microscopy.
The result indicates that the electrical conductivity of the LaY-templated
carbon is two orders of magnitude higher than that of the mesoporous
carbon CMK-3, which has an amorphous framework²⁴.

Given these results, we tested the possibility of using other metal 
ions for ion exchange. We chose Y³⁺ and Ca²⁺, because these metal




[Figure 2d: powder XRD patterns of beta-, EMT- and FAU-templated carbons; intensity (arbitrary units) versus 2θ (degrees).]


Figure 2 | Structures of 3D graphene-like microporous carbons.
a–c, TEM images (main pictures) and Fourier diffractograms (insets) of
template-free carbon, generated using a template of La³⁺-ion-exchanged
FAU (a), EMT (b) or beta (c) zeolites. The images reveal an ordered
pore structure without any external carbon. d, Powder XRD patterns 
from the carbon samples, measured using synchrotron radiation. The 
patterns reveal the highly ordered microporous structure of the carbons, 


ions are known to interact via coordination bonding with the carbon
framework, using electron-donation/back-donation mechanisms²⁷,²⁸
in a similar way to La³⁺. Indeed, the exchange of Y³⁺ or Ca²⁺ with
ions in FAU-Y dramatically increases the rate of carbon deposition
at 600°C, much as in the LaY zeolite (Supplementary Fig. 4). The
final carbon products from these zeolite templates exhibit highly
ordered microporous structures. Another critical factor affecting
carbon deposition is that water vapour should be fed into the ethylene
gas stream. Without water vapour, none of the La³⁺-, Y³⁺- and
Ca²⁺-ion-exchanged zeolites shows sufficient carbon deposition.
We speculate that this phenomenon could be related to the production
of carbides, which the three metal elements used here can form.
Typical carbide formation in the bulk state requires high-temperature 
treatments in an electric arc furnace. However, when metal ions are 
atomically dispersed in a zeolite framework, a carbide might form 
even at 600°C. If so, then as the carbide reacts with water vapour, 
active carbonaceous species might be generated to construct carbon 
frameworks. 

Acetylene gas can be used instead of ethylene to construct carbon
frameworks on the ion-exchanged zeolites. Because acetylene is more
reactive than ethylene, carbon deposition can be accomplished at temperatures
as low as 340°C (Supplementary Fig. 5). In addition, the
smaller molecular size of acetylene enables uniform infiltration of
carbon, even in zeolites with one-dimensional channels (for example,
LTL zeolite) or small pore mouths (LTA and MFI zeolites), which have
been difficult to use as carbon templates¹² (Fig. 3). The LTL zeolite
has a one-dimensional (1D) undulating channel, with narrow sections
of diameter 0.71 nm and wide sections of diameter 1.24 nm, which
alternate with a 0.48-nm periodicity. Accordingly, undulating carbon
tubes can be synthesized inside the La³⁺-exchanged LTL zeolite. When




[Figure 2e, f: SAED pattern of FAU-templated carbon (panel annotation: d = 1.11 nm); current versus applied bias (V) for FAU-templated carbon, mesoporous carbon and the gold (111) substrate.]

corresponding to the pore structure of the template. e, Electron-diffraction 
pattern of a selected area from the FAU-templated carbon, showing 
low-intensity rings corresponding to graphene (100) and (110) reflections. 
f, Current-voltage curves for FAU-templated carbon and CMK-3 
mesoporous carbon on a gold (111) substrate, measured by conductive 
probe atomic force microscopy. 


the template walls are removed after heating at 850°C, the carbon 
tubes self-assemble to form a bundle (Fig. 3a, b). In contrast, if we use 
the Na⁺ or H⁺ form of LTL zeolite, carbon deposition occurs only at
the external surfaces of the template (Extended Data Fig. 8). 

Meanwhile, in the Ca²⁺-exchanged LTA zeolite, the pore diameter
is 1.14 nm; the pores are interconnected to form a 3D network, but the pore
mouths (of diameter 0.5 nm) are so narrow that LTA has not previously
been considered as a carbon template. Nonetheless, our results show
that carbon infiltrates quite uniformly throughout the entire volume
of this zeolite. The final carbon product, liberated from the template,
exhibits the crystal morphology of the zeolite template (Fig. 3c), and
lattice fringes (Fig. 3e). The carbon crystal can be easily crushed by
hand rubbing. The crushed crystal surfaces indicate that the entire
volume of the zeolite crystal is used for carbon synthesis (Fig. 3d).
Notably, the carbons obtained from the LTL and LTA zeolites can be
dispersed in N-methylpyrrolidone (NMP); the solutions show photoluminescence,
indicating that the carbons are soluble in organic solvents
(Fig. 3f). Moreover, the LTA-templated carbon recrystallizes when 
isopropyl alcohol is added to the NMP solution, indicating that the 
carbon products show van der Waals packing of carbon nanotubes 
or carbon dots. 

Compared with the LTL and LTA zeolites, the MFI zeolite is somewhat
more difficult to use for carbon synthesis. This is
because this zeolite has only narrow channels (<0.56 nm in diameter),
without bulged sections. The channels are too narrow to accommodate
even C₆₀ fullerene. Nevertheless, our results show that these narrow
pores can still be used as a template for a carbon nanostructure, and 
that the morphology of the resulting carbons closely resembles that of 
the template. The carbon exhibited a sharp peak centred at 0.49 nm in 
the pore size distribution (Supplementary Fig. 6). This corresponds to 




Figure 3 | Carbon from a 1D-channel LTL zeolite, and from a small- 
pore LTA zeolite. a, Scanning electron microscope (SEM) and b, TEM 
images of LTL-templated carbon, revealing morphologies corresponding 
to a bundle of 1D channels. c, d, SEM and e, TEM images of LTA- 
templated carbon, exhibiting zeolite-like crystal morphologies and pore 
order; these soft carbon crystals can be easily crushed. The SEM image 
of the broken crystal surfaces (d) shows that carbon synthesis uses the 
entire volume of the zeolite crystal. f, Photographs of the LTA- and LTL- 
templated carbons dispersed in NMP solution, in comparison with FAU. 


the thickness of the MFI pore walls, indicating the formation of rigid 
carbon nanostructures inside the narrow zeolite channel. 

Making graphene with 3D periodic nanoporous architectures promises
a range of useful applications, such as in batteries and catalysts²⁹,³⁰,
but has not yet seen full success owing to the lack of efficient synthetic
strategies. Our protocol, with its pore-selective carbon filling at
decreased temperatures, can be readily scaled up for studies requiring 
bulk quantities of carbon (Extended Data Fig. 9). Moreover, the high 
electrical conductivity of the resulting carbon frameworks will be 
useful in battery applications (Supplementary Fig. 7). 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 22 October 2015; accepted 20 April 2016. 
Published online 29 June 2016. 


1. Han, S., Wu, D., Li, S., Zhang, F. & Feng, X. Porous graphene materials for 
advanced electrochemical energy storage and conversion devices. Adv. Funct. 
Mater. 26, 849-864 (2014). 

2. Jiang, L. & Fan, Z. Design of advanced porous graphene materials: from 
graphene nanomesh to 3D architectures. Nanoscale 6, 1922-1945 (2014). 

3. Cao, X., Yin, Z. & Zhang, H. Three-dimensional graphene materials: preparation, 
structures and application in supercapacitors. Energy Environ. Sci. 7, 
1850-1865 (2014). 

4. Vanderbilt, D. & Tersoff, J. Negative-curvature fullerene analog of C₆₀. Phys. Rev.
Lett. 68, 511-513 (1992). 

5. Mackay, A. L. & Terrones, H. Diamond from graphite. Nature 352, 762 (1991).

6. Lenosky, T., Gonze, X., Teter, M. & Elser, V. Energetics of negatively curved
graphitic carbon. Nature 355, 333-335 (1992).

7. Ma, Z., Kyotani, T. & Tomita, A. Preparation of a high surface area microporous
carbon having the structural regularity of Y zeolite. Chem. Commun. 23,
2365-2366 (2000).

8. Ma, Z., Kyotani, T., Liu, Z., Terasaki, O. & Tomita, A. Very high surface area
microporous carbon with a three-dimensional nano-array structure: synthesis
and its molecular structure. Chem. Mater. 13, 4413-4415 (2001).




[Figure 3f: photographs labelled ‘Solvent: N-methyl-2-pyrrolidone’ and ‘Ultraviolet light’.]

The carbon products from the LTA and LTL templates are soluble and 
exhibit photoluminescence under ultraviolet light, but the carbon from 
the FAU template is insoluble. In the latter case, the carbon is synthesized 
in the form of a 3D porous network that extends over several supercages, 
rendering it insoluble in organic solvents. However, in the case of the LTA 
and LTL zeolites, the carbon is obtained as quantum dots or nanotubes; 
these small carbon objects, which are like fullerenes and carbon 
nanotubes, are soluble in organic solvents. 


9. Nishihara, H. et al. A possible buckybowl-like structure of zeolite templated 
carbon. Carbon 47, 1220-1230 (2009). 

10. Nueangnoraj, K. et al. Formation of crosslinked-fullerene-like framework as
negative replica of zeolite Y. Carbon 62, 455-464 (2013). 

11. Parmentier, J., Gaslain, F. O. M., Ersen, O., Centeno, T. A. & Solovyov, L. A.
Structure and sorption properties of a zeolite-templated carbon with the EMT 
structure type. Langmuir 30, 297-307 (2014). 

12. Kyotani, T., Ma, Z. & Tomita, A. Template synthesis of novel porous carbons 
using various types of zeolites. Carbon 41, 1451-1459 (2003). 

13. Yang, Z., Xia, Y. & Mokaya, R. Enhanced hydrogen storage capacity of high surface 
area zeolite-like carbon materials. J. Am. Chem. Soc. 129, 1673-1679 (2007). 

14. Yang, Z., Xia, Y., Sun, X. & Mokaya, R. Preparation and hydrogen storage 
properties of zeolite-templated carbon materials nanocast via chemical vapor 
deposition: effect of the zeolite template and nitrogen doping. J. Phys. Chem. B 
110, 18424-18431 (2006). 

15. Zhou, J. et al. Effect of cation nature of zeolite on carbon replicas and their 
electrochemical capacitance. Electrochim. Acta 89, 763-770 (2013). 

16. Corma, A. From microporous to mesoporous molecular sieve materials and 
their use in catalysis. Chem. Rev. 97, 2373-2420 (1997). 

17. Davis, M. E. & Lobo, R. F. Zeolite and molecular sieve synthesis. Chem. Mater. 4, 
756-768 (1992). 

18. Davis, M. E. Ordered porous materials for emerging applications. Nature 417, 
813-821 (2002). 

19. Corma, A. State of the art and future challenges of zeolites as catalysts. J. Catal. 
216, 298-312 (2003). 

20. Kyotani, T., Nagai, T., Inoue, S. & Tomita, A. Formation of new type of porous 
carbon by carbonization in zeolite nanochannels. Chem. Mater. 9, 609-615 
(1997). 

21. Johnson, S. A., Brigham, E. S., Ollivier, P. J. & Mallouk, T. E. Effect of micropore
topology on the structure and properties of zeolite polymer replicas. Chem.
Mater. 9, 2448-2458 (1997).

22. Guisnet, M. & Magnoux, P. Organic chemistry of coke formation. Appl. Catal.
A Gen. 212, 83-96 (2001).

23. Deschamps, M. et al. Exploring electrolyte organization in supercapacitor
electrodes with solid-state NMR. Nature Mater. 12, 351-358 (2013).

24. Jun, S. et al. Synthesis of new, nanoporous carbon with hexagonally ordered
mesostructure. J. Am. Chem. Soc. 122, 10712-10713 (2000).

25. Liu, X., Giordano, C. & Antonietti, M. A facile molten-salt route to graphene
synthesis. Small 10, 193-200 (2014).

26. Ning, G. et al. Gram-scale synthesis of nanomesh graphene with high surface 
area and its application in supercapacitor electrodes. Chem. Commun. 47, 
5976-5978 (2011). 




27. Chakraborty, B., Modak, P. & Banerjee, S. Hydrogen storage in yttrium-decorated 
single walled carbon nanotube. J. Phys. Chem. 116, 22502-22508 (2012). 

28. Yoon, M. et al. Calcium as the superior coating metal in functionalization of
carbon fullerenes for high-capacity hydrogen storage. Phys. Rev. Lett. 100, 
206806 (2008). 

29. Odkhuu, D. et al. Negatively curved carbon as the anode for lithium ion 
batteries. Carbon 66, 39-47 (2014). 

30. Zhai, Y., Zhu, Z. & Dong, S. Carbon-based nanostructures for advanced 
catalysis. ChemCatChem 7, 2806-2815 (2015). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by IBS-R004-D1. The authors
thank D. Ahn at the Pohang Accelerator Laboratory (PLS) for discussions on 
powder XRD measurements. X-ray crystallography was carried out with help 
from D. Moon at PLS and H. J. Lee at Korea Basic Science Institute. 




Author Contributions R.R. selected metal-ion catalysts intuitively, initiated 
single-crystal investigation, and led the project. K.K. led the synthesis and 
characterization work, with T.L. and Y.K. Y.S. carried out NMR measurements. 
J.S. and J.K.P. carried out electrochemical analysis. H.L. and J.Y.P. analysed the 
electrical conductivity of the carbon product. S.J.C. and T.L. carried out the 
X-ray crystallography. H.I. investigated the mechanism of carbon formation. 
R.R. and K.K. wrote the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to
R.R. (rryoo@kaist.ac.kr).


Reviewer Information Nature thanks P. de Jongh, L. Solovyov and the other
anonymous reviewer(s) for their contribution to the peer review of this
work.




METHODS 

Preparation of zeolite templates. Zeolite beta (Si/Al ratio = 12.5) was purchased
from Zeochem, and MFI (Si/Al = 11.5) from Zeolyst. Other zeolites were synthesized
according to literature procedures³¹,³². For X-ray crystallography, we synthesized
zeolite FAU with large single-crystal morphology³³. Ion exchange was
performed with aqueous solutions of metal salts.

Synthesis of carbon materials. Zeolite was heated to 600°C under a dry N₂ flow,
using a vertically placed, fused quartz reactor equipped with a fritted disk. We
passed a mixture of ethylene gas, N₂ and steam through the zeolite bed at 600°C.
The gas flow was switched to dry N₂ when carbon deposition was completed. Then,
we increased the temperature to 850°C, and maintained it there for 2 hours. The
resulting product was slurried in a 0.3 M HF/0.15 M HCl solution, or alternatively
in concentrated HCl followed by hot 2 M NaOH solution, to release the carbon from
the template. The HF-etched carbon products exhibited an oxygen/carbon molar ratio
of 0.009, while the carbon samples obtained with NaOH washing had
an oxygen/carbon ratio of 0.10.

Collection of crystallographic data. A single crystal (about 35 µm in diameter; see
Supplementary Fig. 8) of La FAU zeolite containing carbons, treated at 600°C, was
dehydrated by flowing high-purity N₂ gas at 600°C for 2 hours, and then put under
vacuum at 350°C for 2 hours, to fully exclude moisture. The crystal was coated with a
layer of Paratone oil (pre-dehydrated for 48 hours) in a glove box, in order to prevent
moisture absorption during sample mounting and XRD measurements (see
Supplementary Fig. 9 for experimental verification through a gravimetric measurement).
The coated crystal was measured at 123 K over the range of 2θ = 5°–149°,
using a Bruker D8-Venture diffractometer with graphite-monochromated Cu Kα
radiation (λ = 0.15418 nm) and a Photon 100 CMOS detector, at the Korea Basic
Science Institute (exposure time = 120 seconds per frame). The Bruker APEX2
program was used for data collection, and SAINT was used for cell refinement
and reduction³⁴. Absorption correction was applied using the SADABS program³⁵.
Derivation of electron-density maps for carbon. XRD data, collected from the 
zeolite-carbon composite crystal, were analysed by means of full-matrix least- 
squares calculations based on F² values with JANA2006 (ref. 36). We found the
data to have an Rint (observed/all) value of 4.73/5.61 for 547/819 reflections,
averaged from 3,996/7,829 reflections, with a redundancy and completeness of 9.559
and 99.9%, respectively. Rint, the merging error, is given by
$R_{\mathrm{int}} = \sum |F_o^2 - \langle F_o^2 \rangle| / \sum F_o^2$,
where $F_o$ is the experimental structure factor and $\langle F_o^2 \rangle$ is the mean over symmetry-equivalent reflections. We obtained an electron-density map
by using a dual space method with a charge flipping algorithm³⁷. We obtained the
space group of Fd3̄m, with an overall agreement factor of 2.85, from the
electron-density map³⁸. We discovered a total of eight atoms in an asymmetric unit,
all in the zeolite framework. Refinement of the zeolite framework structure was
started after assigning correct atom species using the full-matrix least-squares
procedure. The value of maximum (change/s.u.), which is used as a convergence
criterion, decreased from 0.05 to 0.01 during this initial refinement process (s.u.
is the standard uncertainty). The values of R₁/wR₂ (which indicate the agreement
between the crystallographic model and the experimental X-ray diffraction data)
were 18.73/43.00 when Si and O atoms were taken into account for the framework.
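
The merging error quoted above can be illustrated with a short sketch. It assumes the conventional definition, in which each measured F² is compared with the mean of its symmetry-equivalent group; the function name `merging_r_int` and the toy intensities are hypothetical and are not data from this work.

```python
from collections import defaultdict


def merging_r_int(observations):
    """Return R_int for an iterable of (hkl_unique, F_o squared) pairs.

    hkl_unique is any hashable label for the symmetry-unique reflection.
    Assumed definition: sum |F2 - mean(F2)| / sum F2 over all measurements.
    """
    groups = defaultdict(list)
    for hkl_unique, f_squared in observations:
        groups[hkl_unique].append(f_squared)

    numerator = 0.0
    denominator = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        numerator += sum(abs(v - mean) for v in values)
        denominator += sum(values)
    if denominator == 0.0:
        raise ValueError("no intensity to merge")
    return numerator / denominator


# Made-up intensities for illustration only (not data from this work).
toy = [((1, 1, 1), 100.0), ((1, 1, 1), 110.0),
       ((2, 2, 0), 50.0), ((2, 2, 0), 50.0)]
print(f"R_int = {100 * merging_r_int(toy):.2f}%")
```

Crystallographic packages differ in details such as weighting and the handling of singly measured reflections, so values from this sketch need not match a given program exactly.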

For further refinement, including La and Na atoms, first, the scale factor of the
zeolite crystal containing carbons was determined using the high-angle portion
above sinθ/λ = 0.25 Å⁻¹, corresponding to d < 0.2 nm³⁹ (where d indicates the resolution,
defined by the Bragg equation, λ = 2d sinθ). This process yielded R₁/wR₂
values of 9.74/21.04 for 747 unique reflections. The R₁/wR₂ values decreased to
7.49/15.81 when occupancy factors for non-framework species (that is, Na, La,
and O atoms bound to La) were refined. Second, anisotropic atomic displacement
factors were used for all zeolite framework atoms and La atoms. Isotropic atomic
displacement factors were used for non-framework atoms except La. This process
further decreased R₁/wR₂ to 6.30/12.78. The resulting composition of the zeolite
was |La₂₃.₃₂Na₁₀.₁₆O₁₂.₁₂|[T(Si,Al)O₂]₁₉₂, which was consistent with that from the
chemical analysis. 
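
The link between the sinθ/λ cut-off and the resolution d follows directly from the Bragg equation. The sketch below works it through, assuming the cut-off of 0.25 is in units of Å⁻¹ (which is what makes it correspond to d = 0.2 nm); the function names are illustrative only.

```python
import math


def d_from_sin_theta_over_lambda(s):
    """Resolution d from s = sin(theta)/lambda, using lambda = 2 d sin(theta)."""
    return 1.0 / (2.0 * s)


def d_from_two_theta(two_theta_deg, wavelength):
    """Resolution d (same unit as wavelength) at a given 2-theta in degrees."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


# Cut-off used for the scale factor: 0.25 per angstrom (unit assumed).
d_cut = d_from_sin_theta_over_lambda(0.25)
print(f"d at cut-off: {d_cut:.2f} angstrom = {d_cut / 10:.2f} nm")

# Limits of the measured range, 2-theta = 5 to 149 degrees, Cu K-alpha 0.15418 nm.
for two_theta in (5.0, 149.0):
    print(f"2-theta = {two_theta:5.1f} deg -> d = "
          f"{d_from_two_theta(two_theta, 0.15418):.4f} nm")
```

Reflections with d below the cut-off are the high-angle portion, to which the disordered intrapore carbon contributes relatively little; that is presumably why this portion was used to fix the scale factor.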

In this stage, we tried to refine intrapore species using all reflections including
the low-angle portion (that is, 819 unique reflections). The scale factor, determined
above, was fixed until most of the missing intrapore atoms were assigned from a
difference Fourier method. In a first trial of the difference Fourier method, we
found eight peaks. Two peaks were assigned as oxygen that was coordinated to
La in the zeolite framework, and six peaks were assigned as carbon in the zeolite
pore space. The refinement of occupancy factors and positions of the obtained
eight missing peaks resulted in R₁/wR₂ decreasing to 7.07/17.91. The thermal displacement
factor for all carbon atoms was set to a reasonable value, Uiso = 0.08 Å²,
according to ref. 40. By using the difference Fourier method again, we also found
three carbon atoms. Refinement of the additional carbon and oxygen yielded
R₁/wR₂ values of 6.66/15.72. Further inclusion of carbon atoms using the difference
Fourier method did not improve the R factors. Thus, by this stage, it could be
considered that most of the missing atoms were found. The obtained composition
was |C297 5gLa23,52Nai4.27026.69| [T(Si,Al)O2]192. The carbon content is comparable 
to the empirical carbon content (258 atoms per unit cell) obtained from elemental 
analysis of the sample. The obtained composition was changed slightly by the 
subsequent additional refinement of the structural parameters with the scale factor. 
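
As a rough check on what "comparable" means here, the refined and empirical carbon contents can be compared directly. The sketch below takes the refined value as about 297.6 carbon atoms per unit cell (rounded) and the elemental-analysis value as 258; the helper name `relative_excess` is hypothetical.

```python
def relative_excess(refined, reference):
    """Fractional amount by which `refined` exceeds `reference`."""
    return (refined - reference) / reference


refined_carbon = 297.6    # atoms per unit cell from the refinement (rounded)
empirical_carbon = 258.0  # atoms per unit cell from elemental analysis

excess = relative_excess(refined_carbon, empirical_carbon)
print(f"Refined carbon content exceeds the empirical value by {excess:.1%}")
```

On these numbers the refinement places roughly 15% more carbon in the unit cell than elemental analysis finds.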

In the last refinement, the occupancy factor of carbon atoms was fixed, and
all carbon atoms were assumed to have the same atomic displacement parameter.
The constraints listed in a CIF file (see Supplementary Information) were
automatically generated from the symmetry operation, based on this assumption.
Refinement using these constraints was continued until R₁/wR₂ values reached
5.40/13.65 with 547 reflections for I > 3σ(I) (where I is the reflection intensity).
The largest difference peak was 0.50 e Å⁻³, and the deepest hole was −0.43 e Å⁻³. The
goodness-of-fit index was 1.9. The final maximum (change/s.u.) was 0.0051.
However, when all 819 reflections were taken into account, the R₁/wR₂ values
were 7.89/14.36. Moreover, attempts to refine each carbon atomic displacement
parameter independently, without using constraints, failed to give stable convergence.
In this regard, the above refinement result using constraints may not yet
provide an accurate structural solution. After removing the carbon atoms obtained 
from structure refinement with constraints, we visualized an electron-density map 
corresponding to the carbon structure using the difference Fourier method; this 
map was equivalent to the difference between the total electron-density map and 
an electron-density map that corresponds to the zeolite framework. 
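The R1/wR2 pairs quoted above are agreement factors between observed and calculated structure factors. As a rough illustration only, the sketch below uses the commonly quoted definitions, R1 = Σ||Fo| − |Fc|| / Σ|Fo| and wR2 = [Σw(Fo² − Fc²)² / Σw(Fo²)²]^½. The exact weighting scheme applied by the refinement software is not stated in this section, so unit weights are an assumption, and the amplitudes are made-up numbers rather than data from the paper.

```python
import math


def r1(f_obs, f_calc):
    """Conventional R1 = sum(||Fo| - |Fc||) / sum(|Fo|)."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den


def wr2(f_obs, f_calc, weights):
    """Weighted wR2 on F^2: sqrt(sum(w (Fo^2 - Fc^2)^2) / sum(w (Fo^2)^2))."""
    num = sum(w * (fo ** 2 - fc ** 2) ** 2
              for fo, fc, w in zip(f_obs, f_calc, weights))
    den = sum(w * (fo ** 2) ** 2 for fo, w in zip(f_obs, weights))
    return math.sqrt(num / den)


# Hypothetical structure-factor amplitudes (illustrative only, not from the paper).
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [11.0, 19.0, 30.0, 42.0]
weights = [1.0, 1.0, 1.0, 1.0]  # unit weights: an assumption for this sketch

r1_value = r1(f_obs, f_calc)
wr2_value = wr2(f_obs, f_calc, weights)

# Reported as percentages, in the same style as the values quoted in the text.
print(f"R1/wR2 = {100 * r1_value:.2f}/{100 * wr2_value:.2f}")
```

In practice R1 is often computed over the observed subset only (here, reflections with I > 3σ(I)), which is why the text quotes one pair for 547 reflections and another for all 819.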
Characterization. We determined the carbon content in zeolites by
thermogravimetry using a TGA Q50 (TA Instruments). We collected powder XRD data using
a monochromated synchrotron X-ray at Beamline 9B of the Pohang Accelerator
Laboratory. TEM images and SAED patterns were collected with a Titan E-TEM
G2 (FEI) at an acceleration voltage of 300 kV, on samples dispersed in ethanol and
supported on a holey carbon grid (300 mesh). SEM images were taken with a Verios 460 (FEI)
at a landing voltage of 1 kV in deceleration mode (stage bias voltage: 5 kV). ¹³C
NMR spectra were acquired with magic-angle spinning using a Bruker Avance III
HD 400WB. Raman spectra were recorded on a Horiba Jobin Yvon ARAMIS
spectrometer with a laser excitation wavelength of 514 nm. Electrical conductance was
measured using an Agilent 5500 atomic force microscope in air, with a Pt/Ir-coated
tip (PPP-EFM-50, Nanosensors).


31. Breck, D. W. Zeolite Molecular Sieves (Wiley, 1974).

32. Delprato, F., Delmotte, L., Guth, J. L. & Huve, L. Synthesis of new silica-rich
cubic and hexagonal faujasites using crown-ether-based supramolecules as
templates. Zeolites 10, 546-552 (1990).

33. Ferchiche, S., Warzywoda, J. & Sacco, A., Jr Direct synthesis of zeolite Y with
large particle size. Int. J. Inorg. Mater. 3, 773-780 (2001).

34. APEX2 and SAINT (Bruker AXS, 2014).

35. Sheldrick, G. M. SADABS (Univ. Göttingen, 2008).

36. Petříček, V., Dušek, M. & Palatinus, L. Crystallographic computing system
JANA2006: general features. Z. Kristallogr. 229, 345-352 (2014).

37. Palatinus, L. & Chapuis, G. SUPERFLIP – a computer program for the solution
of crystal structures by charge flipping in arbitrary dimensions. J. Appl. Cryst.
40, 786-790 (2007).

38. Palatinus, L. & van der Lee, A. Symmetry determination following structure
solution in P1. J. Appl. Cryst. 41, 975-984 (2008).

39. McCusker, L. B., Von Dreele, R. B., Cox, D. E., Louër, D. & Scardi, P. Rietveld
refinement guidelines. J. Appl. Cryst. 32, 36-50 (1999).

40. Xie, D., McCusker, L. B. & Baerlocher, C. Structure of the borosilicate zeolite
catalyst SSZ-82 solved using 2D-XPD charge flipping. J. Am. Chem. Soc. 133,
20604-20610 (2011).

41. Biener, J. et al. Macroscopic 3D nanographene with dynamically tunable bulk
properties. Adv. Mater. 24, 5083-5087 (2012).


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Plot: carbon content (g g⁻¹ of zeolite) versus temperature (°C), 550-800 °C; curves labelled LaY and HY.]


Extended Data Figure 1 | Carbon deposition in zeolite FAU plotted as a
function of temperature, with various ion exchanges. The zeolites LaY,
NaY and HY were heated to the temperatures indicated under a dry N2
flow, using a vertically placed, fused quartz reactor equipped with a fritted
disk. Subsequently, a mixture of ethylene gas, N2 and steam was passed
through the zeolite bed for 1 hour. The amount of carbon deposited at
each temperature was measured by thermogravimetry. At 600 °C, the
La³⁺-ion-exchanged zeolite had taken up 20 times more carbon
than had HY or NaY.




[Plot, panel a: carbon content versus time (minutes), 0-250 min.]

Extended Data Figure 2 | Carbon deposition in La³⁺-ion-exchanged
FAU zeolite at 600 °C. a, The amount of carbon was measured as a
function of time, using thermogravimetric analysis equipment built in
the carbon deposition rig. The plotted result indicates that the carbon
content in LaY becomes saturated at ~0.3 g g⁻¹ of zeolite. b, A TEM image
of LaY zeolite after 250 min of carbon deposition, showing apparently no
carbon deposition on external surfaces.




[NMR spectra: 'Carbon formed at 600 °C' (12 kHz; asterisks mark spinning sidebands) and 'Final carbon product'; x-axis: chemical shift (ppm), 250 to 0.]


Extended Data Figure 3 | Magic-angle spinning solid-state ¹³C NMR
spectra of the carbon framework formed within zeolite LaY. The NMR
spectra were recorded with various spinning rates on a Bruker Avance
III HD 400WB NMR spectrometer operated at 100.61 MHz for ¹³C. All
spectra were obtained with a 4-μs pulse, a 10-s relaxation delay, and 1,000
acquisitions. Asterisks indicate spinning sidebands for a given spinning
rate. The spectra for carbon obtained at 600 °C exhibit two peaks with
chemical shifts at 123 p.p.m. and 129 p.p.m. The peak at 123 p.p.m. can
be assigned to six-membered-ring sp² carbon; the peak at 129 p.p.m. can
be attributed to five- or seven-membered rings that have smaller C-C-C
angles in the conjugated sp² carbon system. No other peaks (assignable
to sp³ or sp carbons) were detected in the NMR spectra of a sample prepared
with 99% ¹³C-isotope-enriched ethylene. The final carbon product,
liberated from zeolite after heat treatment at 850 °C, has an additional
weak peak at around 180 p.p.m., corresponding to oxygen functional
groups.




[Structure model; legend: zeolite, Na, La, extra-framework oxygen.]

Extended Data Figure 4 | X-ray crystallographic analysis of the carbon
structure formed in a single crystal of La³⁺-ion-exchanged zeolite
FAU. Carbon atomic positions were determined through least-squares
refinement of the distances, using a difference Fourier method
(see Methods for details). To cope with a complex system having high
static disorder of atomic positions, we assumed that all carbon atoms had
the same thermal parameter in the refinement procedure. The refinement
result indicates that atomic positions in pore necks (yellow rectangle)
have high static disorder over a zeolite crystal. That is, the determined
positions can be regarded as overlapped carbon positions over many
identical pore necks. This result, using constraints, may not yet provide an
accurate structural solution.




[Plot: derivative weight loss (% °C⁻¹) versus temperature (°C), 400-700 °C; curves labelled beta-templated carbon, EMT-templated carbon, FAU-templated carbon, conventional carbon replica of zeolite beta, commercial graphene and mesoporous carbon.]


Extended Data Figure 5 | Thermal stability of carbon samples.
The top three curves are derivative thermogravimetric curves for the
carbons synthesized using different lanthanum-ion-exchanged zeolites.
Thermogravimetry was carried out by increasing the temperature to
700 °C, with a ramping rate of 3 °C min⁻¹, under air flow (60 ml min⁻¹).
We compared these thermogravimetric data with the results obtained
using the mesoporous carbon CMK-3 (which has an amorphous
structure), a commercial graphene product (purchased from Graphene
Laboratories Inc.), and a beta-zeolite-templated carbon sample that
was prepared following a two-step carbonization method (bottom
three curves). These data indicate that carbon samples obtained from
lanthanum-ion-exchanged zeolites can have distinctively high thermal
stability, compared with amorphous carbons. Notably, the beta-templated
carbon exhibited high thermal stability in air, like graphene.




[Plot: Raman intensity (arbitrary units) versus Raman shift (cm⁻¹), 1,000-1,800 cm⁻¹; spectra labelled LaY-templated carbon and graphite.]


Extended Data Figure 6 | Raman spectra of LaY-templated carbon and
graphite. The spectra were recorded on a Horiba Jobin Yvon ARAMIS
spectrometer with a laser excitation wavelength of 514 nm. The G- and
D-bands are located at 1,598 cm⁻¹ and 1,341 cm⁻¹, respectively. The
G-band of LaY-templated carbon appears at a higher wavenumber than
that of graphite; such a strong upshift indicates nanosized single graphene
layers. The broad D-band is attributed to bond disorder, for instance
because of the presence of five- or seven-membered carbon rings in the
curved carbon structure.




[AFM images on Au (111): mesoporous carbon and LaY-templated carbon.]


Extended Data Figure 7 | Topographical images of LaY-templated
carbon and CMK-3 mesoporous carbon on an Au (111) substrate.
a, LaY-templated carbon; b, CMK-3 mesoporous carbon. The
current-voltage curves shown in Fig. 2f were measured on the cross-marked areas.
The images were taken using an Agilent 5500 atomic force microscope in
air, using a Pt/Ir-coated tip (see Methods).




Extended Data Figure 8 | Effect of ion exchange on carbon synthesis
using the 1D-channel LTL zeolite. a, SEM image of LTL zeolite. b, SEM
image of carbon liberated from La³⁺-ion-exchanged LTL zeolite. c, SEM
image of carbon from H⁺-ion-exchanged LTL zeolite. For carbon synthesis
using La³⁺-ion-exchanged zeolite, acetylene gas was used as the carbon
source at 500 °C; the remainder of the protocol is as described in the
Methods. For the H⁺-ion-exchanged zeolite, carbon deposition was tested
at various temperatures between 500 °C and 700 °C. However, synthesis
using this H⁺ zeolite failed.




[Plot, panel e: XRD intensity (arbitrary units) versus 2θ (degrees); pattern labelled beta.]

Extended Data Figure 9 | Scaling up the carbon-deposition process.
a, Photograph of the carbonization rig for large-batch synthesis; inset,
the plug-flow reactor filled with a thick bed of carbon-zeolite composite
(about 40 g). From this apparatus, we could obtain about 10 g of
carbon products in a single preparation. b-d, TEM images of the carbon
products synthesized from zeolites FAU (b), EMT (c) and beta (d).
e, XRD patterns of the carbons, confirming their highly ordered structure.
These results indicate that the product quality from the 10-g batch
synthesis is the same as that from the 0.15-g batch.




Extended Data Table 1 | Data collection and refinement statistics for X-ray diffraction analysis

Formula                             |C208.03La23.52Na15.50O27.a7|[T(Si,Al)O2]192
Formula weight                      18094.3
Temperature                         123(2) K
Wavelength                          Cu Kα (λ = 0.15418 nm)
Crystal system                      Cubic
Space group                         Fd-3m
Unit cell dimensions                a = 25.0433 Å
Volume                              15706.33 Å³
Z                                   1
Density (calculated)                1.913 Mg/m³
Absorption coefficient              16.682 mm⁻¹
F(000)                              8737
Crystal size                        0.035 x 0.035 x 0.035 mm³
2θ range in data collection         149°
Index range                         -27 < h < 30, -15 < k < 31, -19 < l < 25
Reflections collected               7829 [Rint(obs/all) = 4.73/5.61]
Completeness to theta = 74.5°       99.9%
Independent reflections (obs/all)   547/819
Refinement method                   Full-matrix least-squares on F²
Data / restraints / parameters      819/0/71
Goodness-of-fit on F²               1.90
Final R indices [I > 3sigma(I)]     R1 = 0.0540, wR2 = 0.1365
R indices (all data)                R1 = 0.0789, wR2 = 0.1436
Largest diff. peak and hole         0.50 and -0.43 e Å⁻³




LETTER 


doi:10.1038/nature18010 


Design of a hyperstable 60-subunit protein icosahedron

Yang Hsia*, Jacob B. Bale*, Shane Gonen*, Dan Shi, William Sheffler, Kimberly K. Fong, Una Nattermann,
Chunfu Xu, Po-Ssu Huang, Rashmi Ravichandran, Sue Yi, Trisha N. Davis, Tamir Gonen, Neil P. King &
David Baker


The icosahedron is the largest of the Platonic solids, and
icosahedral protein structures are widely used in biological systems
for packaging and transport¹,². There has been considerable interest
in repurposing such structures³⁻⁵ for applications ranging from
targeted delivery to multivalent immunogen presentation. The
ability to design proteins that self-assemble into precisely specified,
highly ordered icosahedral structures would open the door to a new
generation of protein containers with properties custom-tailored
to specific applications. Here we describe the computational
design of a 25-nanometre icosahedral nanocage that self-assembles
from trimeric protein building blocks. The designed protein was
produced in Escherichia coli, and found by electron microscopy to
assemble into a homogeneous population of icosahedral particles
nearly identical to the design model. The particles are stable in
6.7 molar guanidine hydrochloride at up to 80 degrees Celsius, and
undergo extremely abrupt, but reversible, disassembly between
2 molar and 2.25 molar guanidinium thiocyanate. The icosahedron
is robust to genetic fusions: one or two copies of green fluorescent
protein (GFP) can be fused to each of the 60 subunits to create
highly fluorescent 'standard candles' for use in light microscopy,
and a designed protein pentamer can be placed in the centre of each
of the 20 pentameric faces to modulate the size of the entrance/exit
channels of the cage. Such robust and customizable nanocages
should have considerable utility in targeted drug delivery⁶, vaccine
design⁷ and synthetic biology⁸.

Programming protein subunits to self-assemble into well defined
complexes is a promising route to the custom design of
macromolecular machines. Protein assemblies have been engineered using
metals⁹,¹⁰, disulfide bonds¹¹⁻¹⁴, genetic fusions¹⁵⁻¹⁷, and ideal
helix-helix interactions¹¹,¹³,¹⁶, but these approaches have generally yielded
polydisperse or unanticipated products. Recently, symmetric
modelling coupled with computational protein-protein interface design has
accurately generated protein assemblies with tetrahedral and
octahedral symmetry¹⁸,¹⁹, but these relatively small (<16 nm diameter)
nanocages have limited use for packaging or delivery applications
because they have little internal volume.

Icosahedral point group symmetry contains two-, three-, and
five-fold axes of rotation (Fig. 1a). To generate novel icosahedral
protein assemblies, trimeric protein scaffolds of known structure
were arranged with icosahedral symmetry (the three-fold axes of the
trimers aligned with the three-fold axes of an icosahedron) and the
two remaining degrees of freedom (the distance r from the
icosahedron centre to the centre of mass of each trimer, and the angle ω
of rotation of each trimer about its axis) were optimized for close
packing without steric clashes (Fig. 1b, c). The amino acid sequences
at the newly formed interfaces between the trimer building blocks
were then optimized using RosettaDesign²⁰,²¹, and 17 designs were
selected for experimental characterization on the basis of properties of
the designed interface, including shape complementarity²², predicted
binding energy, and the number of buried unsatisfied hydrogen-bond
donors and acceptors (see Methods).
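The two rigid-body degrees of freedom described above (r and ω) can be made concrete with a small geometric sketch. It is not the authors' docking code. It assumes a toy trimer whose three subunits are single points at a hypothetical distance d from the trimer axis, builds the 60 icosahedral rotations from one five-fold and one three-fold generator, and places 20 trimers by applying those rotations to a reference trimer on the (1, 1, 1) three-fold axis. All function names and the numerical values of r, ω and d are illustrative assumptions.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def rotation(axis, angle):
    """Rotation matrix (Rodrigues' formula) about `axis` by `angle` radians."""
    x, y, z = normalize(axis)
    c, s, t = math.cos(angle), math.sin(angle), 1 - math.cos(angle)
    return (
        (t * x * x + c, t * x * y - s * z, t * x * z + s * y),
        (t * x * y + s * z, t * y * y + c, t * y * z - s * x),
        (t * x * z - s * y, t * y * z + s * x, t * z * z + c),
    )


def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def key(values, digits=6):
    """Rounded tuple used to recognise numerically identical objects."""
    return tuple(round(v, digits) + 0.0 for v in values)


def icosahedral_rotations():
    """Close the group generated by a five-fold and a three-fold rotation."""
    generators = (rotation((0, 1, PHI), 2 * math.pi / 5),
                  rotation((1, 1, 1), 2 * math.pi / 3))
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    group = {key(e for row in identity for e in row): identity}
    frontier = [identity]
    while frontier:
        fresh = []
        for m in frontier:
            for g in generators:
                p = matmul(g, m)
                k = key(e for row in p for e in row)
                if k not in group:
                    group[k] = p
                    fresh.append(p)
        frontier = fresh
    return list(group.values())


ROTATIONS = icosahedral_rotations()
THREEFOLD = normalize((1, 1, 1))  # reference three-fold axis


def trimer_subunits(r, omega, d):
    """Toy trimer: centre at distance r on the axis, rotated by omega about it."""
    perp = normalize((1, -1, 0))  # perpendicular to the (1, 1, 1) axis
    centre = tuple(r * c for c in THREEFOLD)
    out = []
    for k in range(3):
        off = apply(rotation(THREEFOLD, omega + k * 2 * math.pi / 3), perp)
        out.append(tuple(centre[i] + d * off[i] for i in range(3)))
    return out


def assemble(r, omega, d):
    """All 60 subunit positions generated from one reference subunit."""
    first = trimer_subunits(r, omega, d)[0]
    return [apply(m, first) for m in ROTATIONS]


def closest_approach(r, omega, d):
    """Smallest distance between the reference trimer and any other trimer."""
    ref = trimer_subunits(r, omega, d)
    ref_keys = {key(p) for p in ref}
    others = [p for p in assemble(r, omega, d) if key(p) not in ref_keys]
    return min(math.dist(a, b) for a in ref for b in others)


# Hypothetical values in arbitrary length units, chosen only for illustration.
subunits = assemble(r=10.0, omega=0.3, d=3.0)
trimer_centres = {key(apply(m, THREEFOLD)) for m in ROTATIONS}
print(len(ROTATIONS), len(trimer_centres), len({key(p) for p in subunits}))
print(round(closest_approach(10.0, 0.3, 3.0), 3))
```

A docking search in this picture would scan ω and shrink r until `closest_approach` falls below a clash threshold. The real procedure works on full backbone coordinates and scores the resulting interfaces, which this point model does not attempt.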

Genes encoding the designs were assembled from oligonucleotides
and cloned into the pET29b+ vector for expression in E. coli. Most
of the designs were found in the insoluble fraction upon cell lysis; of
the three soluble designs, two (both based on a KDPG aldolase²³,²⁴)
showed substantial shifts in migration relative to the wild-type scaffold
when analysed by native (non-denaturing) polyacrylamide gel
electrophoresis (PAGE), suggesting higher-order assembly. We selected the
one with fewer mutations, I3-01, for further analysis. Five substitutions
(E26K, E33L, K61M, D187V and R190A) were made to generate the
designed interface between trimers (Fig. 1d; the amino acid sequences
are provided in the Supplementary Information).

I3-01 was purified using immobilized metal affinity and size
exclusion chromatography (SEC), yielding a single peak with an apparent
molecular weight much larger than that of the wild-type trimeric
protein and consistent with the expected elution volume for the
60-subunit assembly (Fig. 1e). A mutant bearing a leucine-to-arginine
substitution predicted to disrupt the designed interface eliminated
the high-molecular-weight species and returned the elution volume
to that of the wild-type scaffold (Fig. 1e). Dynamic light scattering
(DLS) measurements of I3-01 showed a monodisperse population of
particles with a hydrodynamic radius of 14 nm, consistent with the
design model (Fig. 1f). No disassembly to the trimeric building block
was observed at 80 °C or, remarkably, in 6.7 M guanidine
hydrochloride (GuHCl) (Extended Data Fig. 1). This hyperstability is a
property of both the trimeric scaffold from which I3-01 was derived
and of the designed interface: both are completely resistant to
GuHCl denaturation. An exceptionally sharp disassociation into the
constituent trimers was observed between 2 M and 2.25 M guanidinium
thiocyanate (GITC): at 2 M the dominant species is the
icosahedron, while at 2.25 M only the trimeric building block is observed
(Fig. 1g, Extended Data Fig. 2). Importantly for cargo packaging
applications, the disassociation is fully reversible: the hydrodynamic
radius of particles formed by diluting disassembled protein in 3 M
GITC down to 1 M GITC is identical to that of particles originally produced in
E. coli (Fig. 1h).

We investigated the structure of I3-01 using cryo-electron
microscopy (cryo-EM). The individual particles in large fields of view were
homogeneous in size and shape (Fig. 2a), and in class averages from
6,461 particles, the three projections along the symmetry axes and the
overall icosahedral architecture are clearly discernible (Fig. 2b, c). A
three-dimensional model calculated from the cryo-EM data matches

1Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA. 2Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA. 3Graduate
Program in Biological Physics, Structure and Design, University of Washington, Seattle, Washington 98195, USA. 4Graduate Program in Molecular and Cellular Biology, University of Washington,
Seattle, Washington 98195, USA. 5Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia 20147, USA. 6Howard Hughes Medical Institute, University of Washington,
Seattle, Washington 98195, USA.
*These authors contributed equally to this work. 


00 MONTH 2016 | VOL 000 | NATURE | 1 




[Figure 1, panels a-h. e, SEC traces for I3-01, I3-01(L33R) and 1wa3; x-axis: elution volume (ml). f, DLS of I3-01 (2 M GITC; 6.7 M GuHCl; TBS) and 1wa3 (TBS). h, DLS in 1 M GITC, 3 M GITC and 1 M GITC (return). x-axis in f and h: hydrodynamic radius (nm).]


Figure 1 | Design methodology and biochemical characterization.
a, b, Icosahedral three-fold axis in red and aligned trimeric building block
in green. c, Optimization of r and ω yields closely opposed interfaces
between subunits. d, Sequence design yields low-energy interfaces; in the
I3-01 case, composed of five designed residues (thick representations) and
two native residues (thin representations). e, I3-01 appears larger by SEC
than the similarly sized I3-01(L33R) and wild-type trimer (1wa3). f, DLS


the I3-01 design model very well with a correlation coefficient of 0.92
at 20 Å and 1.5σ (Fig. 2d, e), clearly indicating that I3-01 forms the
designed structure: an icosahedron with a diameter of 25 nm and an
interior volume of approximately 3,000 nm³, values that are within the
range of those observed in small viral capsids²⁵.


[Figure 2, panels b and c: projections and class averages along the symmetry axes.]


Figure 2 | Cryo-EM. a, Field-of-view cryo-EM micrograph showing
homogeneous icosahedral particles in various orientations. b,
Back-projections of I3-01 from the design model. c, Cryo-EM class averages




[Figure 1g axes: GITC concentration (M) and hydrodynamic radius (nm).]


measurement of hydrodynamic radius (note logarithmic scale in f and h)
of 1wa3 (3.5 nm) and I3-01 (14 nm). I3-01 remains assembled in 6.7 M
GuHCl and in 2 M GITC. g, Extremely sharp disassociation to trimeric
building blocks at 2.25 M GITC. Data points represent independent
measurements. h, I3-01 icosahedron disassembles into the trimeric
building blocks at 3 M GITC, and reassembles following dilution to 1 M.


To probe the robustness of I3-01 to genetic fusion, we fused
superfolder GFP (sfGFP)²⁶ to one or both termini of the monomeric subunit
and produced the resulting proteins in E. coli. SEC analysis showed
that the fusion proteins had hydrodynamic radii consistent with cage
formation (Extended Data Fig. 3). Analysis of I3-01 with a carboxy




closely match the design projections along all three symmetry axes.
d, e, The calculated initial, unrefined density (blue, 3.22σ) closely matches
the design model (green).




Figure 3 | Tuning nanocage structure and function with genetic fusions.
a, The left panel shows a cryo-EM micrograph of I3-01(ctGFP);
the top right panel shows a computational model with sfGFP in green;
the bottom right panel shows the class average along the five-fold axis.
b, Fluorescence microscopy fields of view. c, Fluorescence intensity
histograms. AFU, arbitrary fluorescence units; ± standard deviation.
d, Correlation between the mean fluorescence intensity and sfGFP


(C)-terminal sfGFP fusion, called I3-01(ctGFP), by cryo-EM
revealed icosahedral particles with overall shapes very similar to those
of the original design. Class averages of 13,297 particles revealed
considerable internal density compared to the original I3-01 averages,
consistent with computational models of the fusion complex (Fig. 3a).
The I3-01 sfGFP fusions are robust to denaturation of the amino (N)-
or C-terminal fused sfGFP in GuHCl; the particles remain assembled
as GFP signal is lost²⁷ (Extended Data Fig. 4).

It is at present challenging to infer subunit copy number in
GFP-tagged assemblies from their fluorescence intensity. What is needed
are 'standard candles' with known fluorescent protein copy
numbers that can be used to correlate fluorescence intensity to copy
number. To complement the icosahedra with 60 and 120 copies of
sfGFP described above, we fused sfGFP to one or both components
of a previously described two-component tetrahedron (T33-21;
ref. 19) to generate assemblies with 12 or 24 copies of sfGFP (Extended
Data Fig. 3). Intensity histograms obtained for each of the
sfGFP-nanocage constructs using widefield fluorescence microscopy were well
fitted with Gaussians (Fig. 3b, c), and the mean fluorescence intensity
for each cage was found to be linearly proportional (r² = 0.9925)
to sfGFP copy number (Fig. 3d). The fluorescent properties of the
particles were readily manipulated by substituting sfGFP with
mTurquoise2 and sYFP2 (Extended Data Fig. 5). In addition to serving
as genetically encoded, water-soluble fluorescent standard candles,
the fluorescent protein cage fusions could be useful for correlative
light and electron microscopy²⁸ since the icosahedral shape is quite
distinctive.
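The standard-candle idea amounts to a linear calibration of intensity against copy number. The sketch below is a minimal illustration, not the authors' analysis. It fits a least-squares line to the four mean intensities annotated on the Fig. 3c histograms (585, 1,520, 3,060 and 5,270 AFU for 12, 24, 60 and 120 copies) and inverts the line to estimate the copy number of an unknown assembly. Because the reported r² = 0.9925 refers to the data plotted in Fig. 3d, the value computed from these four numbers need not match it exactly. The function names and the example intensity are hypothetical.

```python
# Mean fluorescence intensities (AFU) read from the Fig. 3c annotations.
copies = [12, 24, 60, 120]
mean_afu = [585.0, 1520.0, 3060.0, 5270.0]


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, with r^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(copies, mean_afu)


def estimate_copy_number(afu):
    """Invert the calibration line to estimate copies from an intensity."""
    return (afu - intercept) / slope


print(f"slope = {slope:.1f} AFU per copy, intercept = {intercept:.0f} AFU, "
      f"r^2 = {r_squared:.4f}")

# Hypothetical measurement of an unknown GFP-tagged assembly.
print(f"estimated copies at 2,000 AFU: {estimate_copy_number(2000.0):.0f}")
```

The estimate is only as good as the assumption that the unknown assembly is imaged under the same conditions as the calibration cages.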

In I3-01, the trimeric building blocks are aligned with the three-fold
axes while the designed interface is along the icosahedral two-folds.
To explore the possibility of symmetry-matched fusions to designed
nanocages, we modelled a designed pentameric helical bundle²⁹ into
the centre of the large 9-nm pore at the five-fold axis with a C-terminal
linker; this fusion was named I3-01(HB). Negative-stain electron
microscopy showed monodisperse particles of the expected size and
symmetry; the incorporation of the pentamer does not interfere with




[Figure 3, panels c, d and f. c, Fluorescence intensity histograms (x-axis: total AFU (10³)); annotated means ± s.d.: 12mer, 585 ± 235; 24mer, 1,520 ± 275; 60mer, 3,060 ± 799; 120mer, 5,270 ± 778. d, Mean fluorescence intensity versus number of GFP molecules. f, I3-01(HB).]


copy number for nanoparticles with different numbers of fused sfGFP
molecules. Error bars are s.e.m. (n = 3). e, f, Computational model and
class averages along the five-fold axis of negatively stained I3-01 (e) and
I3-01(HB) (f); the helical bundle is shown in red. Weak density in the
centre of the pentameric faces in I3-01 may reflect randomly packaged
material. There is clear density in the centre of the pentameric faces in the
I3-01(HB) class averages consistent with the model.


icosahedron assembly. Particle averages showed a structure similar to 
that of the original icosahedron, with additional density at the centre of 
each five-fold axis, consistent with computational models of the fusion 
protein (Fig. 3e, f). The capability of incorporating symmetry-matched 
substructures into designed nanocages offers considerable flexibil- 
ity and modularity; for example, pentamers filling otherwise open 
pentameric faces could control the release of cargo contained within 
the nanocage. 

The designed I3-01 icosahedron is exceptionally stable, robust to
genetic fusion, and has a considerably larger internal volume than 
previously designed nanocages with well defined and prespecified 
structures'*!”!°, Enzymatic activity is retained in the assembled ico- 
sahedron (Extended Data Fig. 6), suggesting a route to custom nano- 
reactors. The ability to accurately design icosahedral protein structures 
opens the door to new approaches to vaccine generation and targeted 
drug delivery. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 12 January; accepted 13 April 2016. 
Published online 15 June 2016. 


1. Zandi, R., Reguera, D., Bruinsma, R. F., Gelbart, W. M. & Rudnick, J. Origin of 
icosahedral symmetry in viruses. Proc. Natl Acad. Sci. USA 101, 15556-15560 
(2004). 

2. Ritsert, K. et al. Studies on the lumazine synthase/riboflavin synthase complex 
of Bacillus subtilis: crystal structure analysis of reconstituted, icosahedral 
beta-subunit capsids with bound substrate analogue inhibitor at 2.4 A 
resolution. J. Mol. Biol. 253, 151-167 (1995). 

3. Howorka, S. Rationally engineering natural protein assemblies in 
nanobiotechnology. Curr. Opin. Biotechnol. 22, 485-491 (2011). 

4. Roldão, A., Mellado, M. C. M., Castilho, L. R., Carrondo, M. J. T. & Alves, P. M.
Virus-like particles in vaccine development. Expert Rev. Vaccines 9, 1149-1176
(2010).

5. Effio, C. L. & Hubbuch, J. Next generation vaccines and vectors: designing
downstream processes for recombinant protein-based virus-like particles.
Biotechnol. J. 10, 715-727 (2015).




6. Ma, Y., Nolte, R. J. M. & Cornelissen, J. J. L. M. Virus-based nanocarriers for
drug delivery. Adv. Drug Deliv. Rev. 64, 811-825 (2012).

7. Smith, M. L. et al. Modified tobacco mosaic virus particles as scaffolds for
display of protein antigens for vaccine applications. Virology 348, 475-488
(2006).

8. Bauler, P., Huber, G., Leyh, T. & McCammon, J. A. Channeling by proximity:
the catalytic advantages of active site colocalization using Brownian dynamics.
J. Phys. Chem. Lett. 1, 1332-1335 (2010).

9. Brodin, J. D. et al. Metal-directed, chemically tunable assembly of one-,
two- and three-dimensional crystalline protein arrays. Nat. Chem. 4,
375-382 (2012).

10. Der, B. S. et al. Metal-mediated affinity and orientation specificity in a
computationally designed protein homodimer. J. Am. Chem. Soc. 134,
375-385 (2012).

11. Fletcher, J. M. et al. Self-assembling cages from coiled-coil peptide modules.
Science 340, 595-599 (2013).

12. Usui, K. et al. Nanoscale elongating control of the self-assembled protein
filament with the cysteine-introduced building blocks. Protein Sci. 18, 960-969
(2009).

13. Raman, S., Machaidze, G., Lustig, A., Aebi, U. & Burkhard, P. Structure-based
design of peptides that self-assemble into regular polyhedral nanoparticles.
Nanomedicine 2, 95-102 (2006).

14. Raman, S. et al. Design of peptide nanoparticles using simple protein
oligomerization domains. Open Nanomed. J. 2, 15-26 (2009).

15. Sinclair, J. C., Davies, K. M., Vénien-Bryan, C. & Noble, M. E. M. Generation of
protein lattices by fusing proteins with matching rotational symmetry.
Nat. Nanotechnol. 6, 558-562 (2011).

16. Boyle, A. L. et al. Squaring the circle in peptide assembly: from fibers to
discrete nanostructures by de novo design. J. Am. Chem. Soc. 134,
15457-15467 (2012).

17. Lai, Y.-T. et al. Structure of a designed protein cage that self-assembles into a
highly porous cube. Nat. Chem. 6, 1065-1071 (2014).

18. King, N. P. et al. Computational design of self-assembling protein
nanomaterials with atomic level accuracy. Science 336, 1171-1174 (2012).

19. King, N. P. et al. Accurate design of co-assembling multi-component
protein nanomaterials. Nature 510, 103-108 (2014).

20. Leaver-Fay, A. et al. ROSETTA3: an object-oriented software suite for the
simulation and design of macromolecules. Methods Enzymol. 487, 545-574
(2011).

21. DiMaio, F., Leaver-Fay, A., Bradley, P., Baker, D. & André, I. Modeling symmetric
macromolecular structures in Rosetta3. PLoS One 6, e20450 (2011).

22. Lawrence, M. C. & Colman, P. M. Shape complementarity at protein/protein
interfaces. J. Mol. Biol. 234, 946-950 (1993).

23. Griffiths, J. S. et al. Cloning, isolation and characterization of the Thermotoga
maritima KDPG aldolase. Bioorg. Med. Chem. 10, 545-550 (2002).



24. Fullerton, S. W. B. et al. Mechanism of the class I KDPG aldolase. Bioorg. Med.
Chem. 14, 3002-3010 (2006).

25. Perlmutter, J. D. & Hagan, M. F. Mechanisms of virus assembly. Annu. Rev. 
Phys. Chem. 66, 217-239 (2015). 

26. Pédelacq, J.-D., Cabantous, S., Tran, T., Terwilliger, T. C. & Waldo, G. S. 
Engineering and characterization of a superfolder green fluorescent protein. 
Nat. Biotechnol. 24, 79-88 (2006). 

27. Andrews, B. T., Schoenfish, A. R., Roy, M., Waldo, G. & Jennings, P. A. The rough 
energy landscape of superfolder GFP is linked to the chromophore. J. Mol. Biol. 
373, 476-490 (2007). 

28. Cortese, K., Diaspro, A. & Tacchetti, C. Advanced correlative light/electron 
microscopy: current methods and new developments using Tokuyasu 
cryosections. J. Histochem. Cytochem. 57, 1103-1112 (2009). 

29. Huang, P.-S. et al. High thermodynamic stability of parametrically designed
helical bundles. Science 346, 481-485 (2014). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by the Howard Hughes Medical
Institute (D.B. and T.G.), the JRC visitor programme (S.G.), the National Science
Foundation CHE-1332907 (D.B.), a UW/Hutch CCSG Pilot Award NCI 5 P30
CA015704-41 (D.B. and N.P.K.), Takeda Pharmaceutical Company (N.P.K.),
the Bill and Melinda Gates Foundation OPP1120319 (D.B. and N.P.K.), and
the National Institutes of Health P41 GM103533 (T.N.D.). Y.H. was supported
in part by an NIH Molecular Biology Training Grant (T32GM008268). U.N. was
supported in part by a PHS National Research Service Award (T32GM007270)
from NIGMS. J.B.B. was supported in part by an NSF Graduate Research
Fellowship (DGE-0718124). We thank the Janelia Research Campus Cryo-EM
Facility and J. de la Cruz for their assistance with the Titan Krios.


Author Contributions J.B.B., N.P.K., and W.S. developed the computational 
design methodology. Y.H. and J.B.B. performed the design of the icosahedra. 
Y.H. performed all other unlisted experiments. S.G. and D.S. performed 
the cryo-EM experiments. K.K.F. performed the fluorescence microscopy 
experiments. U.N. performed the negative-stain electron microscopy 
experiments. C.X. provided the pentamer sequence for I3-01(HB). P.-S.H. 
created the computational methodology to model fusions to I3-01. R.R. 
produced I3-01(HB) proteins. S.Y. produced T33-21 sfGFP fusion proteins. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 
D.B. (dabaker@uw.edu). 




METHODS 


Computational design. Crystal structures of 300 trimers with resolution 
better than 2.5 Å and lacking long loops were selected from the Protein Data 
Bank (PDB) to use as building blocks. For each scaffold, 20 trimeric building 
blocks were arranged in icosahedral symmetry by aligning the three-fold 
rotational axis of each trimer with one of the three-fold icosahedral symmetry axes. 
While preserving symmetry, the building blocks were then docked together by 
enumeratively sampling their rotations (ω) about the three-fold symmetry axes 
and translating (r) them into contact along the aligned axes. Configurations in 
which backbone atoms from different building blocks were less than 3.5 Å apart 
were discarded. Non-clashing design models were ranked based on the number 
of pairs of β-carbons in adjacent subunits within 12 Å and further sampling was 
carried out around the top 208 docked configurations on a 0.5 Å by 0.2 Å grid. 
Symmetric RosettaDesign calculations (refs 20 and 21) were then used to generate 
low-energy, symmetric hydrophobic interfaces, and the resulting designs were 
filtered based on shape complementarity (sc; ref. 22), interface surface area (sasa), 
buried unsatisfied hydrogen bonds (uhb), and binding energy (ddg). Designed 
substitutions that did not substantially contribute to the interface were reverted 
to their original identities. All Rosetta scripts used are available upon request; the 
full 60-subunit design model of I3-01 is provided in the Supplementary Information. 
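
As a rough illustration of the docking logic described above (grid enumeration, clash rejection, ranking by contact count), the following sketch scores a toy two-body system. It is not the Rosetta-based protocol used in this work: the function names, the single rigid body rotated about the z axis and the coordinates a caller would pass in are all hypothetical. Only the 3.5 Å clash cutoff and the 12 Å contact cutoff are taken from the text.

```python
import math
from itertools import product

CLASH_CUTOFF = 3.5     # Å, backbone-backbone clash distance quoted in the text
CONTACT_CUTOFF = 12.0  # Å, beta-carbon contact distance quoted in the text


def transform(points, omega_deg, r):
    """Rotate points about the z axis by omega_deg, then shift them by r along z."""
    w = math.radians(omega_deg)
    c, s = math.cos(w), math.sin(w)
    return [(c * x - s * y, s * x + c * y, z + r) for x, y, z in points]


def count_close_pairs(a, b, cutoff):
    """Number of (a, b) atom pairs closer than cutoff."""
    return sum(1 for p, q in product(a, b) if math.dist(p, q) < cutoff)


def score_grid(mobile_bb, mobile_cb, fixed_bb, fixed_cb, omegas, rs):
    """Enumerate the (omega, r) grid, drop clashing poses, rank by contact count."""
    ranked = []
    for omega, r in product(omegas, rs):
        # Discard any pose with a backbone clash (hypothetical toy criterion).
        if count_close_pairs(transform(mobile_bb, omega, r), fixed_bb, CLASH_CUTOFF):
            continue
        contacts = count_close_pairs(
            transform(mobile_cb, omega, r), fixed_cb, CONTACT_CUTOFF
        )
        ranked.append((contacts, omega, r))
    return sorted(ranked, reverse=True)
```

In a real icosahedral docking run every trimer would move symmetrically about its own three-fold axis; the sketch keeps one body fixed purely to make the filtering and ranking steps easy to follow.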

Cloning, screening, and protein purification. Codon-optimized genes encoding 
the wild-type and the designed molecules were generated by recursive 
polymerase chain reaction (PCR) from sets of synthetic oligonucleotides (Integrated 
DNA Technologies). Five mutations were incorporated into I3-01: E26K, E33L, 
K61M, D187V and R190A. All genes were cloned into the pET29b+ plasmid with 
kanamycin resistance and expressed in BL21 Star (DE3) E. coli cells (Invitrogen) 
induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) for 4 h at 37 °C. Cell 
lysis was accomplished in Tris-buffered saline (TBS; 50 mM Tris, 500 mM NaCl) 
with lysozyme (0.25 mg ml−1) and sonication (Fisher Scientific) at 20 W for 5 min 
total 'on' time, using cycles of 10 s on, 10 s off. 

For initial screening, all constructs were labelled with the CoA-488 fluorophore 
(NEB) by the addition of AcpS (ref. 30) (NEB) using an A1 peptide tag, allowing the 
solubility and assembly state of each design to be analysed using SDS-PAGE and 
native-PAGE (Bio-Rad), following procedures previously described'®. All subsequent 
experiments were performed on either (His)6-tagged or untagged protein. 

After lysis and centrifugation at 20,000g for 30 min, the soluble fraction of 
(His)6-tagged proteins was passed through 2 ml of nickel nitrilotriacetic acid 
agarose (Ni-NTA) (Qiagen), washed with 30 mM imidazole, and eluted with 500 mM 
imidazole. Pure proteins were collected after elution from a Superose 6 10/300 GL 
SEC column (GE Healthcare) at 9-11 ml, depending on the fusion variant. 

For non-(His)6-tagged proteins, cells were lysed as above, and the cleared lysates 
were treated with serial ammonium sulfate precipitation treatments (20%, 60% 
w/v). During each step, solid ammonium sulfate was added to the lysate to the 
desired percentage, and equilibrated at room temperature for 1h. Ammonium 
sulfate precipitated protein was then collected by centrifugation at 20,000g for 
30 min at 25°C. After treatment at 60%, the pellet was then solubilized in TBS and 
heated at 80°C for 10 min. The soluble fraction was then collected and further 
purified through SEC as described. 

KDPG enzyme assay. The reaction was carried out in 25 mM HEPES, 20 mM 
NaCl buffer at pH 7.0 with the presence of NADH (0.1 mM), L-lactate dehydro- 
genase (LDH, 0.11 Ul” !), and 2-keto-3-deoxy-6-phosphogluconate (KDPG, 
1 mM) at 25 °C, based on previously described methods”’. Native 1wa3, I3-01, 
or I3-01(K129A) was added at 0.02 µM final concentration to each well and 
immediately monitored for 339 nm ultraviolet absorbance over time. 
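
In a coupled assay of this kind, the pyruvate released by the aldolase is reduced by LDH with the oxidation of one NADH, so the fall in absorbance near 340 nm tracks substrate turnover. The sketch below shows how such a trace could be converted into a rate with the Beer-Lambert law. The extinction coefficient (6,220 M−1 cm−1, the usual textbook value for NADH at 340 nm), the 1 cm path length and the function names are assumptions made for illustration and are not taken from the paper; a plate-reader well generally has a shorter path length.

```python
NADH_EPSILON = 6220.0  # M^-1 cm^-1 near 340 nm; textbook value, assumed here


def nadh_rate_uM_per_s(delta_abs, delta_t_s, path_cm=1.0, epsilon=NADH_EPSILON):
    """Convert an absorbance change over delta_t_s into NADH consumed (µM per s).

    delta_abs is negative when NADH is being oxidized, so the returned rate
    is positive for an active enzyme.
    """
    conc_change_molar = -delta_abs / (epsilon * path_cm)
    return conc_change_molar / delta_t_s * 1e6


def turnover_per_s(rate_uM_per_s, enzyme_uM):
    """Apparent turnover number, assuming 1:1 coupling of NADH to product."""
    return rate_uM_per_s / enzyme_uM
```

For example, a drop of 0.0622 absorbance units over 100 s in a 1 cm cell corresponds to roughly 0.1 µM NADH consumed per second under these assumptions.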

Dynamic light scattering. Purified protein was measured using a DynaPro 
NanoStar (Wyatt) DLS setup. 0.5 mg ml−1 of I3-01 and 1wa3 were measured at 
25 °C, then the temperature was ramped up to 90 °C, then ramped back down to 
25 °C for temperature scans at 2 °C min−1. Measurements were taken in the 
presence of TBS: 25 mM Tris, 500 mM NaCl; buffered GuHCl: 25 mM Tris, 500 mM 
NaCl, 1-6.7 M GuHCl; or buffered GITC: 25 mM Tris, 500 mM NaCl, 1-4 M GITC. 
Different concentrations of GITC-equilibrated samples were achieved by combining 
stocks of 0 M and 4 M equilibrated solutions in different ratios, whereas 
GuHCl-equilibrated samples were equilibrated individually. Each sample was allowed 
to equilibrate in its respective buffer for at least 24 h before measurement. 
Re-annealing experiments were performed by diluting I3-01 equilibrated in 3 M 
GITC down to 1 M GITC final concentration (0.166 mg ml−1 protein). Data analysis 
was performed using DYNAMICS v7 (Wyatt), reporting regularization fits (with 
D10/D50/D90) except for temperature ramp experiments, where cumulant fits 
were used. The ~1 nm radius particle consistent with GITC buffer alone was 
disregarded for analysis, and monodispersity was assumed when peak polydispersity 
was below 15% (refs 31 and 32). 
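
The stock-mixing and dilution arithmetic above can be checked with a few lines of code. This is only a sketch with hypothetical helper names. It assumes that the 0 M and 4 M stocks hold protein at the same concentration, that the re-annealing sample started at 0.5 mg ml−1, and that dilution was with protein-free buffer; under those assumptions a threefold dilution from 3 M to 1 M GITC gives about 0.167 mg ml−1, in line with the quoted 0.166 mg ml−1.

```python
def mix_fraction(target_molar, low_molar=0.0, high_molar=4.0):
    """Volume fraction of the high-denaturant stock needed to reach target_molar."""
    if not low_molar <= target_molar <= high_molar:
        raise ValueError("target must lie between the two stock concentrations")
    return (target_molar - low_molar) / (high_molar - low_molar)


def diluted_protein(protein_mg_ml, start_molar, final_molar):
    """Protein concentration after diluting denaturant with protein-free buffer."""
    return protein_mg_ml * final_molar / start_molar


def is_monodisperse(polydispersity_pct, threshold_pct=15.0):
    """Apply the 'below 15% peak polydispersity' criterion from the text."""
    return polydispersity_pct < threshold_pct
```

A 2 M GITC sample, for instance, would be an equal-volume mix of the 0 M and 4 M stocks.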




Negative-stain electron microscopy. 3 µl of purified I3-01 and I3-01(ctGFP) at 
0.1 mg ml−1 were applied to glow-discharged, carbon-coated 200-mesh copper 
grids (Ted Pella), washed with Milli-Q water and stained with 0.75% uranyl 
formate as described previously. Grids were visualized for assembly validation and 
stability and subsequently optimized for cryo-EM data collection. Screening was 
performed on a 120 kV Tecnai Spirit T12 transmission electron microscope (FEI) 
with a bottom-mount TVIPS F416 CMOS 4k camera. 

6 µl of purified I3-01 and I3-01(HB) at 0.05-0.1 mg ml−1 were applied to 
glow-discharged, carbon-coated 400-mesh copper grids (Ted Pella), washed with Milli-Q 
water and stained with 0.75% uranyl formate. Grids were visualized for assembly 
validation and optimized for data collection. Screening and sample optimization 
were performed on a 100 kV Morgagni M268 transmission electron microscope 
(FEI) equipped with an Orius charge-coupled device (CCD) camera (Gatan). 
Data collection was performed on a 120 kV Tecnai G2 Spirit transmission electron 
microscope (FEI). All final images were recorded using an Ultrascan 4000 4k × 4k 
CCD camera (Gatan) at 52,000× magnification at the specimen level. Coordinates 
for 6,576 I3-01 and 4,131 I3-01(HB) unique particles were obtained for 
averaging using EMAN2 (ref. 34). Boxed particles were used to obtain two-dimensional 
class averages by refinement in EMAN2. Additional image analysis was performed 
using ImageJ (ref. 35). 

Cryo-EM. 5 µl of purified untagged I3-01 and I3-01(ctGFP), diluted to 
~0.1 mg ml−1 using TBS buffer (25 mM Tris pH 8.0, 150 mM NaCl) with an 
additional 2 mM dithiothreitol, were applied to glow-discharged 1.2/1.3 Quantifoil grids, 
blotted and plunged into liquid ethane using a Vitrobot (FEI). Screening and grid 
optimization were performed on a 200 kV TF20 transmission electron microscope 
(FEI) with a bottom-mount TVIPS F416 CMOS 4k camera. Movies of 4-6 s were 
recorded on a 300 kV Titan Krios (FEI) using a Gatan K2 direct detector at either 
29,000× or 37,000× magnification at the specimen level at ~10 electrons per 
pixel per second. 

Movies were motion-corrected using previously described methods (ref. 36). 
Coordinates for 6,461 (I3-01) and 13,297 (I3-01(ctGFP)) unique particles were 
obtained for averaging using EMAN2 (ref. 34). Extracted frames of these particles 
were used to calculate class averages by refinement in IMAGIC (ref. 37) using 
multiple rounds of multivariate statistical analysis and multi-reference alignment. 
An initial density model was calculated based on the calculated averages using 
EMAN2 (ref. 34) and the fitting of the model and correlation were calculated using 
UCSF Chimera (ref. 38). Low-resolution (17-30 Å) volumes from the I3-01 design 
model were calculated using SPIDER (ref. 39) and inspected in UCSF Chimera (ref. 38). 
Back-projection images were computed in SPIDER (ref. 39) on the low-resolution 
volumes and visualized using WEB (ref. 39). The contrast of all micrographs was 
enhanced in Fiji (ref. 40). 

Symmetrical linker modelling. RosettaRemodel (ref. 41) was used to model 
I3-01(ctGFP) and to generate linkers for I3-01(HB). For I3-01(ctGFP), I3-01 was 
held static while the linker was sampled via fragment insertion, placing the sfGFP 
molecules at the end of the linker. The overall model was sampled symmetrically 
with icosahedral symmetry. 

For I3-01(HB), I3-01 was held static while linkers of different lengths (7-12 
residues) were sampled via fragment insertion. The resulting placement of the 
helical bundle at the end of the linker was filtered with pentameric assembly 
constraints to determine linker lengths that could satisfy formation of the 
pentameric helical bundle. The shorter linkers that allowed unstrained helical 
assembly were selected for experimental testing. Example scripts are supplied in the 
Supplementary Information. 

Fluorescence microscopy. Different constructs used for fluorescence microscopy 
were generated by genetically fusing sfGFP to the termini of nanocages. For 
T33-21, the sfGFP was fused to either the C terminus of the first component (12 
sfGFP molecules), or the C terminus of both components (24 sfGFP molecules). 
For I3-01, the sfGFP was fused to either terminus of I3-01 (60 sfGFP molecules), 
or both termini of I3-01 (120 sfGFP molecules). For mTurquoise2 and SYFP2 
versions, sfGFP was replaced with the sequence of the respective fluorescent 
protein bearing additional surface mutations identical to sfGFP**. 

GFP nanocages were mounted on agarose pads for microscopy as previously 
described’. Images of the GFP nanocages were obtained using a DeltaVision 
system (Applied Precision) with an IX70 inverted microscope (Olympus), a U Plan 
Apo 100× objective (1.35 NA) and a CoolSnap HQ digital camera (Photometrics). 
GFP images were taken with a 0.4 s exposure, in a single focal plane, and binned 
1×1. 

The fluorescence intensities of GFP puncta were identified and quantified using 
custom Matlab programs as previously described‘; programs are available upon 
request. Fluorescent intensity histograms of individual sfGFP-fused cages were 
fitted with Gaussian distributions, shown with mean total arbitrary fluorescence 
unit (AFU) intensity ± one standard deviation. 
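
The copy numbers quoted above follow from the cage stoichiometry, and the reported "mean ± one standard deviation" can be illustrated with a very small stand-in for the Gaussian fit. The sketch below is not the authors' Matlab code: it uses Python's standard `statistics.NormalDist.from_samples`, which estimates the Gaussian from the sample mean and sample standard deviation of the per-punctum intensities rather than fitting a binned histogram, and the helper names are hypothetical.

```python
from statistics import NormalDist


def expected_copies(chains_per_fusion_site, fused_sites):
    """sfGFP copies per cage: chains carrying each fusion site times sites fused."""
    return chains_per_fusion_site * fused_sites


def fit_gaussian(intensities):
    """Return (mean, standard deviation) of a Gaussian estimated from samples."""
    dist = NormalDist.from_samples(intensities)
    return dist.mean, dist.stdev
```

With 60 subunits, I3-01 gives `expected_copies(60, 1)` = 60 or `expected_copies(60, 2)` = 120 sfGFP molecules; T33-21, with 12 copies of each component, gives 12 or 24.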




30. Zhou, Z. et al. Genetically encoded short peptide tags for orthogonal protein 
labeling by Sfp and AcpS phosphopantetheinyl transferases. ACS Chem. Biol. 
2, 337-346 (2007). 

31. Baalousha, M. & Lead, J. R. Nanoparticle dispersity in toxicology. 
Nat. Nanotechnol. 8, 308-309 (2013). 

32. Zulauf, M. & D’Arcy, A. Light scattering of proteins as a criterion for 
crystallization. J. Cryst. Growth 122, 102-106 (1992). 

33. Nannenga, B. L., Iadanza, M. G., Vollmar, B. S. & Gonen, T. in Current Protocols 
in Protein Science (eds Coligan, J. E., Dunn, B. M., Speicher, D. W. & 
Wingfield, P. T.) Ch. 17.15 (John Wiley & Sons, 2013). 

34. Tang, G. et al. EMAN2: an extensible image processing suite for electron 
microscopy. J. Struct. Biol. 157, 38-46 (2007). 

35. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years 
of image analysis. Nat. Methods 9, 671-675 (2012). 

36. Li, X. et al. Electron counting and beam-induced motion correction enable 
near-atomic-resolution single-particle cryo-EM. Nat. Methods 10, 584-590 
(2013). 

37. van Heel, M., Harauz, G., Orlova, E. V., Schmidt, R. & Schatz, M. A new 
generation of the IMAGIC image processing system. J. Struct. Biol. 116, 17-24 
(1996). 

38. Pettersen, E. F. et al. UCSF Chimera-a visualization system for exploratory 
research and analysis. J. Comput. Chem. 25, 1605-1612 (2004). 

39. Frank, J. et al. SPIDER and WEB: processing and visualization of images in 
3D electron microscopy and related fields. J. Struct. Biol. 116, 190-199 
(1996). 

40. Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. 
Nat. Methods 9, 676-682 (2012). 

41. Huang, P-S. et al. RosettaRemodel: a generalized framework for flexible 
backbone protein design. PLoS One 6, e24109 (2011). 

42. Muller, E. G. D. et al. The organization of the core proteins of the yeast spindle 
pole body. Mol. Biol. Cell 16, 3341-3352 (2005). 

43. Shimogawa, M. M., Wargacki, M. M., Muller, E. G. & Davis, T. N. Laterally 
attached kinetochores recruit the checkpoint protein Bub1, but satisfy the 
spindle checkpoint. Cell Cycle 9, 3619-3628 (2010). 




[Extended Data Figure 1 plot: panels a) TBS, b) 6.7 M GuHCl, c) 2 M GITC; y axis, radius (nm); x axis, temperature (°C).] 

Extended Data Figure 1 | I3-01 tolerance to temperature. DLS measurements as I3-01 is subjected to heating to 90 °C (solid line), then cooling to 
25 °C (dotted line) in TBS (a), 6.7 M GuHCl (b) and 2 M GITC (c). Under all three conditions, any indications of aggregation or increase in size due to 
temperature appear to be completely reversible. 




[Extended Data Figure 2 plot: % intensity (offset) versus radius (nm, logarithmic scale).] 

Extended Data Figure 2 | Reproducibility of I3-01 transition in 2 M to 2.25 M GITC. Four examples each of independent measurements at 2 M (blue) 
and 2.25 M (red) GITC using DLS show the reproducibility of the cage dissociation. Histograms are plotted offset by 1% intensity from each other 
for clarity. 




[Extended Data Figure 3 plot: normalized A230 (mAU) versus elution volume (ml).] 

Extended Data Figure 3 | SEC of T33-21 and I3-01 fused with sfGFP. 
Size exclusion chromatography traces for T33-21 (12mer in red and 24mer 
in blue) and I3-01 (60mer in green and 120mer in purple) sfGFP fusions 
display increased particle sizes with increasing copies of GFP, but retain 
monodispersed populations. The N-terminal fusion of sfGFP (dashed line) 
is expected to extend mostly outward from the icosahedron, thus greatly 
increasing the hydrodynamic radius, while the C-terminal fusion is 
predicted to occupy the internal void space. A230, ultraviolet absorbance 
at 230 nm; mAU, milli-absorbance units. 




[Extended Data Figure 4 plots: panels for I3-01(ntGFP) and I3-01(ctGFP); x axis, hydrodynamic radius (nm, logarithmic scale).] 

Extended Data Figure 4 | Tolerance of I3-01-sfGFP fusions to GuHCl. 
N-terminal (red) and C-terminal (blue) sfGFP fusions were equilibrated 
to 0-6.4 M GuHCl. Ultraviolet absorbance at 490 nm (A490) monitors the 
unfolding of sfGFP (top, solid line and crosses). DLS experiments (top, 
dotted line and dots) reveal that as sfGFP unfolds, the hydrodynamic radius 
increases slightly, and then stabilizes. The bottom panels show that in 
1 M GuHCl (solid line) and in 6 M GuHCl (dotted line), the icosahedral 
assemblies remain relatively monodisperse. 




Extended Data Figure 5 | I3-01 C-terminal fusions with other fluorescent proteins. Fluorescent proteins mTurquoise2 (in blue) or sYFP2 (in green) 
were fused to the C terminus of I3-01. The field of view using widefield fluorescence microscopy shows distinct signals of each type when the two types 
are mixed together. 




[Extended Data Figure 6 plot: absorbance (UV339) versus time (seconds) for blank, 1wa3, I3-01 and I3-01(K129A).] 

Extended Data Figure 6 | I3-01 retains native enzyme activity. Coupled KDPG aldolase assay showing native-like enzymatic activity in I3-01. 
The K129A knockout shows no enzyme activity, similar to buffer alone. UV339, absorbance at 339 nm; error bars are standard deviation. 




LETTER 


doi:10.1038/nature17992 


Subduction controls the distribution and 
fragmentation of Earth’s tectonic plates 


Claire Mallard1, Nicolas Coltice1,2, Maria Seton3, R. Dietmar Müller3 & Paul J. Tackley4 


The theory of plate tectonics describes how the surface of Earth is 
split into an organized jigsaw of seven large plates’ of similar sizes 
and a population of smaller plates whose areas follow a fractal 
distribution’. The reconstruction of global tectonics during the 
past 200 million years‘ suggests that this layout is probably a long- 
term feature of Earth, but the forces governing it are unknown. 
Previous studies*””*, primarily based on the statistical properties of 
plate distributions, were unable to resolve how the size of the plates is 
determined by the properties of the lithosphere and the underlying 
mantle convection. Here we demonstrate that the plate layout of Earth 
is produced by a dynamic feedback between mantle convection and 
the strength of the lithosphere. Using three-dimensional spherical 
models of mantle convection that self-consistently produce the 
plate size-frequency distribution observed for Earth, we show that 
subduction geometry drives the tectonic fragmentation that generates 
plates. The spacing between the slabs controls the layout of large 
plates, and the stresses caused by the bending of trenches break plates 
into smaller fragments. Our results explain why the fast evolution in 
small back-arc plates”* reflects the marked changes in plate motions 
during times of major reorganizations. Our study opens the way to 
using convection simulations with plate-like behaviour to unravel how 
global tectonics and mantle convection are dynamically connected. 


1Laboratoire de Géologie de Lyon, Ecole Normale Supérieure, UMR 5276 CNRS, Université de Lyon 1, 69622 Villeurbanne, France. 2Institut Universitaire de France, 103 Boulevard Saint Michel, 
75005 Paris, France. 3EarthByte Group, School of Geosciences, Madsen Building F09, University of Sydney, New South Wales 2006, Australia. 4Institute of Geophysics, Department of Earth 
Sciences, ETH Zürich, Sonneggstrasse 5, 8092 Zurich, Switzerland. 

[Figure 1 images: panels a-e, convection snapshots and Earth; panels f-j, spectral heterogeneity maps from the surface to the CMB versus spherical harmonic degree; colour scales, normalized power and shear velocity variation (%).] 

Figure 1 | Snapshots of convection calculations and of Earth with 
associated spectral heterogeneity maps of the temperature field and 
seismic velocity field. The spectral heterogeneity maps are normalized 
by the value of the highest power. a, The convection solution with a yield 
stress of 100 MPa contains a large number of plate boundaries. f, The 
corresponding spherical harmonic map is dominated by degree 6 in the 
shallow boundary layer. b, The convection solution with a yield stress of 
150 MPa has fewer plate boundaries and a decreasing number of slabs. 
g, The corresponding spherical harmonic map is dominated by degree 4 at 
the surface. c, The convection solution with a yield stress of 200 MPa has 
even fewer plate boundaries. h, The corresponding spherical harmonic 
map is dominated by degree 4 at the surface. d, The convection solution 
with a yield stress of 250 MPa has a surface that is barely deformed. 
i, The corresponding spherical harmonic map is blue and dominated 
by degree 2. e, ETOPO1” global relief model of Earth and a cross-section 
through S-wave tomographic model SEMUCB-WM1*”. j, The 
corresponding spherical harmonic map of the tomographic model is 
dominated by degrees 4-5 at the surface. CMB, core-mantle boundary. 

The outer shell of Earth comprises an interlocking mosaic of 52 tectonic 
plates”. Among these plates, two groups can be distinguished: 
a group of seven large plates of similar area covering up to 94% of 
the planet and a group of smaller plates whose areas follow a fractal 
distribution”?. The presence of these two statistically distinct groups 
was previously proposed to reflect two distinct evolutionary laws: the 
large size group being tied to mantle flow and the other to lithosphere 
dynamics’. In contrast, other studies*® have suggested that this plate 
layout is produced by superficial processes, because the larger plates 
may also fit a fractal distribution. Resolving this controversy has been 
limited by the exclusive use of statistical tools, which do not provide an 
understanding of the underlying forces and physical principles behind 
the organization of the plate system. 

Here, we use 3D spherical models of mantle convection to uncover 
the geodynamical processes that drive the tessellation of tectonic plates. 
Our dynamic models combine pseudo-plasticity and large variations 
in viscosity (Fig. 1; see Methods), which generate a plate-like behaviour 
self-consistently® ', including fundamental features of sea-floor 
spreading'”. In our models, pseudo-plasticity is implemented through 
a yield stress that represents a plastic limit at which the viscosity drops 
and strain localization occurs, producing the equivalent of plate boundaries. 
The value of the yield stress is a measure of the stress at plate 


boundaries and does not necessarily correspond to experimental values. 
We determine the yield stress range that allows plate-like behaviour, 
as in previous studies!*-!5. For our convection parameterization, 
this range lies between 100 MPa, below which surface deformation 
is very diffuse, and 350 MPa, above which the surface consists of a 
stagnant lid. We analyse the plate pattern of models with yield stresses 
of 100 MPa (model 1), 150 MPa (model 2), 200 MPa (model 3) and 
250 MPa (model 4) (see Fig. 1). Typically, 90% of the deformation is 
concentrated in less than 15% of the surface in our models. 

Convection modelling generates continuous fields. As a consequence, 
we have to use plate tectonics rules to delineate the layouts of 
plates that self-consistently emerge in our dynamical solutions. We digitize 
plate boundaries on several snapshots for each yield stress value. To 
be sure that we study snapshots that are substantially different and not 
correlated, we pick snapshots separated by more than 100 Myr (ref. 16). 
We choose three snapshots for model 1 and five snapshots for every 
other model (see Methods). We manually build plate polygons using 
GPlates!” through a careful analysis of the surface velocity, horizontal 
divergence, viscosity, synthetic sea-floor age and temperature field for 
each snapshot (see Methods, Extended Data Figs 1 and 2). From this 
we extract the cumulative number versus area distribution of plates for 
each convection snapshot (Fig. 2). 

In model 1 (Fig. 2a), there are more than a hundred plates distributed 
along a smooth curve. The smallest plate has a size similar to the Easter 
microplate and the largest one is smaller than the South American plate, 
which is notably smaller than Earth's larger plates. In contrast, the largest 
plate for model 4 is larger than the Pacific plate and small plates are 
absent (Fig. 2d). The snapshots of the models with intermediate yield 
stresses (models 2 and 3) display the same distributions of plate sizes as 
observed on Earth (Fig. 2b, c, Extended Data Fig. 3). For a yield stress 
of 150 MPa (Fig. 2b), the smallest plate is the equivalent of the South 
Sandwich microplate and the size of the largest is between the areas 
of the North American plate and the Pacific plate. For a yield stress of 
200 MPa (Fig. 2c), the smallest plate is slightly larger than that for a yield 
stress of 150 MPa, but the largest plate is close in area to the Pacific plate. 

Our models indicate that the maximum plate size increases with 
increasing yield stress, which itself has the effect of increasing the 
wavelength of convection'>. For the lowest yield stress value the spherical 
harmonic power spectrum of the temperature field is dominated by 
shorter wavelengths and by spherical harmonic degree 6 in the shallow 
boundary layer (Fig. 1f), which represents the existence of numerous 
subduction zones and the relatively short wavelengths of the flow in 
the mantle. For the two intermediate values of 150 MPa and 200 MPa 
(Fig. 1g, h) the spectra drift to longer wavelengths because degree 4 
dominates in the shallow boundary layer, corresponding to a lower number 
of subduction zones. The maximum size of the plates is similar in both 
cases. When the yield stress increases to 250 MPa (Fig. 1i), degree 2 
dominates in the shallow boundary layer, corresponding to the maximum 
size of the plates in all of the models. These results suggest that the 
size of the large plates follows the spacing between active downwellings. 

[Figure 2 plots: panels a-d, log10[cumulative plate count] versus log10[plate size (km²)] for yield stresses of 100, 150, 200 and 250 MPa, each compared with Earth (ref. 2); vertical dashed lines separate smaller and larger plates.] 

Figure 2 | Plots of the logarithm of cumulative plate count versus 
the logarithm of plate size for four yield stress values and Earth. The 
cumulative plate count represents the number of plates that exceed a given 
area. The graphs contain three data sets for a yield stress of 100 MPa or 
five data sets for other yield stress values, and the data set for Earth (ref. 2), 
in which the distinction between small plates and large plates (indicated 
by the vertical dashed lines) is around 10^7.6 km² (39,800,000 km²). a, Graph 
for models with a yield stress of 100 MPa, showing a distribution of small 
and medium plates. b, Graph for models with a yield stress of 150 MPa, 
showing a distinction between distributions of the large and the small 
plates. The shift in the distribution occurs at a plate size of about 10^7.8 km² 
(63,100,000 km²). c, Graph for models with a yield stress of 200 MPa, 
displaying fewer small plates; the groups of small and large plates are 
distinct and split at about 10^7.6 km² (39,800,000 km²). d, Graph for models 
with a yield stress of 250 MPa, showing only medium and large plates. 
The division between smaller and larger plates in b and c corresponds 
to the cross-over of the fitted slopes of the large and smaller plates 
(Extended Data Fig. 3). 
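
The cumulative size-frequency curves discussed for Fig. 2 are straightforward to compute once plate areas are known. The sketch below is a minimal illustration with hypothetical function names, not the analysis code used for the paper. It ranks plates from largest to smallest and takes the rank as the cumulative count (the number of plates at least as large as a given plate, assuming no ties), and it splits a list of areas at the 39,800,000 km² threshold quoted for Earth, which is 10^7.6 km² to three significant figures.

```python
import math


def cumulative_plate_counts(areas_km2):
    """Return (log10 area, log10 cumulative count) pairs, largest plate first.

    The cumulative count of a plate is its rank in descending order of area,
    that is, the number of plates at least as large (ties are not handled).
    """
    ordered = sorted(areas_km2, reverse=True)
    return [
        (math.log10(area), math.log10(rank))
        for rank, area in enumerate(ordered, start=1)
    ]


def split_by_size(areas_km2, log10_threshold=7.6):
    """Split areas into (smaller, larger) plates at 10**log10_threshold km^2."""
    cut = 10 ** log10_threshold
    smaller = [a for a in areas_km2 if a < cut]
    larger = [a for a in areas_km2 if a >= cut]
    return smaller, larger
```

Plotting the pairs returned by `cumulative_plate_counts` on linear axes gives the same log-log view as the figure; a change in slope between the two groups is what the dashed lines mark.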

Previous studies on the distributions of smaller plates point to 
a fragmentation process*. We therefore focus on triple junctions, 
which are symptoms of plate fragmentation: the splitting of a plate 
into two smaller ones necessarily produces two triple junctions. Both 
the models and Earth display considerably more triple junctions on 
subduction zones than on mid-ocean ridges (106.6 versus 75.6 on 
average for model 2; 131 versus 71 on Earth today), despite the fact 
that mid-ocean ridges are more elongated than trenches (total length 
of mid-ocean ridges and trenches: 79,000 km versus 66,000 km on 
average for model 2; 72,500 km versus 48,000 km on Earth today). 
Likewise, the triple junctions that are mainly composed of trench 
segments are those that involve smaller plates in higher proportions 
(Extended Data Fig. 4). Hence subduction zones focus fragmentation 
and the formation of smaller plates. On Earth, only the Galapagos, 
Easter and Juan Fernandez plates formed away from any trench or 
collisional area. 

Our calculations show that plates fragment mostly in connection 
with curved trenches. Indeed, surface velocities tend to be perpendicular 
to the trench where slabs sink. Therefore a bend in the trench 
corresponds to differential motion and hence high stresses. As a 
consequence, a concave plate under tensile stresses fragments and triple 
junctions connect the trench with new ridge/transform/diffuse 
segments. This is consistent with the observed correlation between 
the tortuosity of trenches and the number of triple junctions per unit 
length of subduction (Fig. 3). Because increasing the yield stress 
produces less tortuous trenches and fewer triple junctions per unit length 
of trench, smaller plate generation is also controlled by the strength of 
the lithosphere. 

[Figure 3 plot: number of triple junctions per 1,000 km of subduction zone versus tortuosity; individual data and averages for 100, 150, 200 and 250 MPa, and Earth.] 

Figure 3 | Number of triple junctions per 1,000 km of subduction zones 
versus the average tortuosity. Data are shown for four yield stress values 
and Earth (see legend). The tortuosity is the ratio of the length of the 
subduction zone to the length of the great circle between the end points. 
The error bars represent the standard deviation for each data set. 

[Figure 4 maps: panels a-d; colour scale, log10[dimensionless viscosity]; line colours, transform and mid-ocean ridges, subduction zones and diffuse boundaries; shading, three plate-area classes.] 

Figure 4 | Global viscosity maps of model 2 and the associated 
kinematics. a–c, Maps are separated by 10 Myr. The shapes of the large 
plates do not change much, whereas the adjustment of the small plates 
evolves quickly. d, 90 Myr after the first snapshot (a), the distribution of 
the large plates and smaller plates has evolved substantially. In a–d, the 
top panels show the viscosity of the mantle (colour scale); the bottom 
panels show the different boundary types (coloured lines) and plate 
sizes (shading) within the boxed regions in the top panels (which focus 
on longitudes between −30° and 90° and latitudes between −30° and 
30°). The arrows indicate the direction and magnitude (represented by 
arrow length) of the mantle flow. Plate-size categories are determined in 
Extended Data Fig. 3. 

The models with plate area distributions similar to Earth also have 
similar lengths of convergent boundaries to Earth, shown by compar- 
ing the trenches in our models with trenches plus mountain belts on 
Earth”. Moreover, the computed temperature heterogeneity spectra of 
the intermediate yield stress case (Fig. 1g) are consistent with tomographic
models of Earth’s mantle (ref. 18; Fig. 1j), having degree 2 dominating
in the deep mantle. However, our models include simplifications
because of computational limitations: a lower Rayleigh number than
on Earth (10⁶ versus about 10⁷), incompressibility and no chemical
differences (no continents or deep chemical piles). The physics principles
we propose for the plate size tessellation are not specifically dependent
on the Rayleigh number (ref. 19), although the yield stress values could
be different. Compressibility should have little impact on the surface
tectonics because it concerns the deeper flow (ref. 20). The addition of
continents, which help to generate more Earth-like area–age distributions
of the sea floor (ref. 12), should reinforce the presence of the larger plates and
ensure large-scale flow.

On the basis of our results, we propose that the plate pattern on Earth 
is produced by the dynamic feedback between mantle convection and 
the strength of the lithosphere. The self-organized subduction struc- 
ture defines the pattern of large and small plates through slab pull and 
suction. The large-plate system evolves over hundreds of millions of 
years through global reorganizations of mantle flow due to the initiation
and shutdown of subduction (Fig. 4). This timescale is commensurate
with the lifetime of slabs (ref. 21). In contrast, the smaller plates in our
models evolve on shorter timescales of tens of millions of years (Fig. 4).
They record lateral changes in trench geometry and slab migrations (ref. 22).
The enhanced sensitivity of the smaller plates to the readjustment of 
subduction systems is consistent with present-day observations of sea- 
floor spreading in many back-arc regions. Our models also reveal that 
global and regional changes in plate motions may be more readily and 
dramatically expressed in these smaller plates than in the larger plates. 
For instance, the Parece Vela and Shikoku basins in the Philippine
Sea plate record a major clockwise change in the spreading direction
between 22 Myr ago and 23 Myr ago (ref. 7), at the same time that the
larger Pacific plate records substantial plate boundary and plate motion
changes (for example, the fragmentation of the Farallon plate (ref. 23) and the
collision of the Ontong Java Plateau with the Melanesian subduction
zone (ref. 24)). In the same way, the Lau Basin in the southwest Pacific
initiated its main spreading phase by successive southward propagation
around 4 Myr ago (ref. 8), at the same time as a change in the spreading
direction in the northeast (ref. 25) and southwest Pacific (ref. 26) and a major phase
of subsidence across the Atlantic (ref. 27).

We propose that the plate layout is a property that characterizes a 
dynamic feedback between mantle convection and lithosphere strength. 
The larger plates are an expression of the dominating convection 
wavelength, and their fragmentation into smaller plates is driven by 
subduction geometry. The decreasing number of smaller plates in 
pre-Cenozoic-era tectonic reconstructions** is therefore an artifi- 
cial consequence of the diminishing quantity of preserved sea floor. 
Confirming the existence of migrating intra-oceanic subduction systems
such as in Panthalassa (ref. 28) may help to correct that bias. Over longer
geologic timescales, the size distribution of plates has certainly evolved 
with the slow cooling of Earth. Following the declining convective vig- 
our, the lithosphere gets stronger relative to mantle forces. Therefore, 
this study suggests that since plate tectonics started on Earth, it may 
have operated with fewer, larger plates as the planet has cooled down. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 16 September 2015; accepted 4 April 2016. 
Published online 15 June 2016. 


1. Le Pichon, X. Sea-floor spreading and continental drift. J. Geophys. Res. 73, 
3661-3697 (1968). 

2. Bird, P. An updated digital model of plate boundaries. Geochem. Geophys. 
Geosyst. 4, 1027 (2003). 

3. Morra, G., Seton, M., Quevedo, L. & Müller, R. D. Organization of the tectonic
plates in the last 200 Myr. Earth Planet. Sci. Lett. 373, 93-101 (2013).

4. Seton, M. et al. Global continental and ocean basin reconstructions since
200 Ma. Earth Sci. Rev. 113, 212-270 (2012).

5. Sornette, D. & Pisarenko, V. Fractal plate tectonics. Geophys. Res. Lett. 30, 1105 
(2003). 

6. Vallianatos, F. & Sammonds, P. Is plate tectonics a case of non-extensive 
thermodynamics? Physica A 389, 4989-4993 (2010). 

7. Sdrolias, M., Roest, W. R. & Müller, R. D. An expression of Philippine Sea plate
rotation: the Parece Vela and Shikoku Basins. Tectonophysics 394, 69-86 (2004). 

8. Taylor, B., Zellmer, K., Martinez, F. & Goodliffe, A. Sea-floor spreading in the 
Lau back-arc basin. Earth Planet. Sci. Lett. 144, 35-40 (1996). 



9. Moresi, L. & Solomatov, V. Mantle convection with a brittle lithosphere: 
thoughts on the global tectonic styles of the Earth and Venus. Geophys. J. Int. 
133, 669-682 (1998). 

10. Trompert, R. & Hansen, U. Mantle convection simulations with rheologies that 
generate plate-like behaviour. Nature 395, 686-689 (1998). 

11. Tackley, P. J. Self-consistent generation of tectonic plates in time-dependent, 
three-dimensional mantle convection simulations: 1. Pseudoplastic yielding. 
Geochem. Geophys. Geosyst. 1, 1021 (2000).

12. Coltice, N., Seton, M., Rolf, T., Müller, R. & Tackley, P. J. Convergence of
tectonic reconstructions and mantle convection models for significant 
fluctuations in seafloor spreading. Earth Planet. Sci. Lett. 383, 92-100 
(2013). 

13. Ricard, Y., Bercovici, D. & Schubert, G. A two-phase model for compaction and 
damage: 2. Applications to compaction, deformation, and the role of interfacial 
surface tension. J. Geophys. Res. 106, 8907 (2001). 

14. Stein, C., Schmalzl, J. & Hansen, U. The effect of rheological parameters on 
plate behaviour in a self-consistent model of mantle convection. Phys. Earth 
Planet. Inter. 142, 225-255 (2004). 

15. van Heck, H. J. & Tackley, P. J. Planforms of self-consistently generated plates in 
3D spherical geometry. Geophys. Res. Lett. 35, L19312 (2008). 

16. Bello, L., Coltice, N., Rolf, T. & Tackley, P. J. On the predictability limit of 
convection models of the Earth’s mantle. Geochem. Geophys. Geosyst. 15, 
2319-2328 (2014). 

17. Williams, S. E., Müller, R. D. & Landgrebe, T. C. W. An open-source software
environment for visualizing and refining plate tectonic reconstructions using
high-resolution geological and geophysical data sets. GSA Today 22(4),
4-9 (2012).

18. Becker, T. W. & Boschi, L. A comparison of tomographic and geodynamic
mantle models. Geochem. Geophys. Geosyst. 3, 1003 (2002).

19. van Heck, H. J. & Tackley, P. J. Plate tectonics on super-Earths: equally or more
likely than on Earth. Earth Planet. Sci. Lett. 310, 252-261 (2011).

20. Tackley, P. J. Modelling compressible mantle convection with large viscosity
contrasts in a three-dimensional spherical shell using the yin-yang grid.
Phys. Earth Planet. Inter. 171, 7-18 (2008).

21. Matthews, K. J., Seton, M. & Müller, R. D. A global-scale plate
reorganization event at 105-100 Ma. Earth Planet. Sci. Lett. 355-356,
283-298 (2012).

22. Stegman, D. R., Schellart, W. P. & Freeman, J. Competing influences of
plate width and far-field boundary conditions on trench migration and
morphology of subducted slabs in the upper mantle. Tectonophysics 483,
46-57 (2010).

23. Barckhausen, U., Ranero, C. R., Cande, S. C., Engels, M. & Weinrebe, W. Birth of
an intraoceanic spreading center. Geology 36, 767-770 (2008).

24. Petterson, M. G. et al. Geological-tectonic framework of Solomon Islands,
SW Pacific: crustal accretion and growth within an intra-oceanic setting.
Tectonophysics 301, 35-60 (1999).

25. Harbert, W. Late Neogene relative motions of the Pacific and North America
plates. Tectonics 10, 1-15 (1991).

26. Tebbens, S. F. & Cande, S. C. Southeast Pacific tectonic evolution from early
Oligocene to present. J. Geophys. Res. 102(B6), 12061-12084 (1997).

27. Cloetingh, S. A. P. L., Gradstein, F. M., Kooi, H., Grant, A. C. & Kaminski, M. M.
Plate reorganization: a cause of rapid late Neogene subsidence and
sedimentation around the North Atlantic? J. Geol. Soc. Lond. 147, 495-506
(1990).

28. van der Meer, D. G., Torsvik, T. H., Spakman, W., van Hinsbergen, D. J. J. & 
Amaru, M. L. Intra-Panthalassa Ocean subduction zones revealed by fossil 
arcs and mantle structure. Nat. Geosci. 5, 215-219 (2012). 

29. Amante, C. & Eakins, B. W. ETOPO1 1 Arc-Minute Global Relief Model: 
Procedures, Data Sources and Analysis www.ngdc.noaa.gov/mgg/global/relief/ 
ETOPO1/image/color_etopol_ice_low.tif.zip (US National Oceanic and 
Atmospheric Administration, 2009). 

30. French, S. W. & Romanowicz, B. A. Broad plumes rooted at the base of the 
Earth’s mantle beneath major hotspots. Nature 525, 95-99 (2015). 


Acknowledgements The research leading to these results was funded by the
European Research Council within the framework of the SP2-Ideas Programme
ERC-2013-CoG under ERC grant agreement 617588. We thank S. Durant
and E. Debayle for helping to make Fig. 1e, i and E. J. Garnero for his inputs.
Calculations were performed on the AUGURY supercomputer at P2CHPD Lyon.
N.C. was supported by the Institut Universitaire de France. R.D.M. and M.S. are
supported by ARC grants DP130101946 and FT130101564.


Author Contributions C.M. developed the methodology for analysing
the convection models, conducted the plate analysis, contributed to the
interpretation and wrote the manuscript. N.C. conducted the convection
calculations, contributed to the development of the methodology and analysis,
contributed to the interpretation and wrote the manuscript. M.S. and R.D.M.
provided guidance with GPlates and scripts, contributed to the interpretation
and wrote the manuscript. P.J.T. provided the StagYY convection code, guidance
on using it and wrote the manuscript.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
C.M. (claire.mallard@univ-lyon1.fr). 




METHODS 


Convection models. The models computed here have similar parameterizations 
to those published in ref. 16, except no surface velocities are imposed here (free 
convection). We solve the non-dimensional equations of mass, momentum and 
heat conservation in a 3D spherical geometry using the code StagYY (ref. 20). The 
flow is incompressible under the Boussinesq approximation. Viscosity is the only 
variable material property in our models. Variations of other material properties 
(expansion coefficient, thermal diffusivity, heat production) are neglected. 
The Rayleigh number Ra is defined here as

$$Ra = \frac{\rho g \alpha \Delta T L^{3}}{\kappa \eta_{0}}$$

where ρ is density, g is the gravitational acceleration, α is the thermal expansivity,
ΔT is the temperature drop across mantle depth, L is the mantle thickness, κ is
thermal diffusivity and η₀ is the reference viscosity at the base of the mantle. The
non-dimensional temperature is set to T = 0 at the surface and T = 1 at the base
of the mantle. A non-dimensional internal heat production of 20 is chosen, such
that the basal heat flux is about 14% of the total. This is in the lower range of the
estimates for heat flow at the core-mantle boundary (ref. 31).
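
The definition of Ra above can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name and every numerical value in the usage lines are illustrative assumptions (rough Earth-like magnitudes chosen here), not parameters reported in this paper.

```python
def rayleigh_number(rho, g, alpha, delta_T, L, kappa, eta0):
    """Ra = rho * g * alpha * delta_T * L**3 / (kappa * eta0), in SI units."""
    return rho * g * alpha * delta_T * L**3 / (kappa * eta0)


# Illustrative values only (assumptions, not taken from the paper).
ra = rayleigh_number(
    rho=4000.0,      # density, kg m^-3 (assumed)
    g=10.0,          # gravitational acceleration, m s^-2 (assumed)
    alpha=2e-5,      # thermal expansivity, K^-1 (assumed)
    delta_T=3000.0,  # temperature drop across the mantle, K (assumed)
    L=2.89e6,        # mantle thickness, m (assumed)
    kappa=1e-6,      # thermal diffusivity, m^2 s^-1 (assumed)
    eta0=1e23,       # reference viscosity, Pa s (assumed)
)
print(f"Ra = {ra:.2e}")
```

Because Ra scales with L³ and inversely with η₀, a tenfold increase in the reference viscosity lowers Ra by a factor of ten, which is one way to read the gap between the model value and the value expected for Earth.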

In our models Ra is 10⁶, which is about 10-50 times lower than is expected for
Earth, and produces a top boundary layer that is 300-km thick. We were limited
to this value of Ra because of the computational power required to solve for
convection with large viscosity variations. The average resolution is 45 km laterally
and vertically for all of the models.

The viscosity in our models depends on temperature and depth as

$$\eta(T, z) = \eta_{z}(z) \exp\left(0.064 + \frac{30}{T + 1}\right)$$

where z is the depth. A value of 30 for the non-dimensional activation energy
produces six orders of magnitude variations in viscosity with temperature.
The depth dependence of viscosity is taken into account such that

$$\eta_{z}(z) = a \exp\{\ln(B)\,[\ldots]\}$$

where B is the factor of the viscosity jump at depth d₀ over a thickness 2d_step, and a
is a prefactor ensuring η_z = η₀ for temperature T = 1 at the base of the mantle. Based
on geoid (ref. 32) and post-glacial rebound modelling (ref. 33), B is set to 30 here and the
viscosity jump occurs between 750-km and 850-km depth (d₀ is 0.276 and d_step is 0.02).

Pseudo-plasticity is implemented through a stress dependence of the viscosity
with yield stress (refs 9-11). When the local stress reaches the yield stress value σ_Y, the
viscosity is computed as

$$\eta_{Y} = \frac{\sigma_{Y}}{2 \dot{\varepsilon}}$$

where ε̇ is the second invariant of the strain-rate tensor. The StagYY code has been
benchmarked with such rheology (ref. 34). Yield stress is the only parameter varied in
this study. Taking 77= 107? Pa s, the yield stress values that produce plate-like 
behaviour are between 100 MPa and 350 MPa. 
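
The statement that an activation energy of 30 gives about six orders of magnitude of viscosity variation can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function names are hypothetical, only the temperature-dependent factor exp(0.064 + 30/(T + 1)) is evaluated (the depth-dependent prefactor cancels in the ratio at fixed depth), and the strain rate in the last line is an arbitrary illustrative value.

```python
import math


def temperature_factor(T, activation=30.0, offset=0.064):
    """Temperature-dependent part of the viscosity law, exp(offset + activation / (T + 1))."""
    return math.exp(offset + activation / (T + 1.0))


def yield_viscosity(sigma_y, strain_rate_invariant):
    """Viscosity applied once the stress reaches the yield stress: sigma_Y / (2 * strain rate)."""
    return sigma_y / (2.0 * strain_rate_invariant)


# Contrast between the cold surface (T = 0) and the hot base of the mantle (T = 1).
contrast = temperature_factor(0.0) / temperature_factor(1.0)  # equals exp(30 - 15)
orders = math.log10(contrast)
print(round(orders, 2))  # 6.51

# Illustrative yielding example: 100 MPa yield stress, assumed strain rate of 1e-15 s^-1.
print(f"{yield_viscosity(100e6, 1e-15):.1e} Pa s")
```

The contrast is exp(15), roughly 3 × 10⁶, so the quoted six orders of magnitude is the rounded-down value.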

In our models, the viscosity drops by a factor of 10 in the vicinity of ridges, 
where the temperature crosses the solidus temperature, given by a simple linear 
model T_sol = 0.6 + 7.5z, and without a dependence on the melt fraction. This effect
improves slightly plate-like behaviour and has been used in previous studies!». 
The models are started from ad hoc initial conditions, and run for up to five billion 
years to ensure a statistical steady state and the stability of the dynamic regime. 
Such long runs ensure that the initial conditions are forgotten and do not affect
the results. From the solutions at a statistical steady state, we compute the
dynamic evolutions of the models that are analysed in this study. 

Code availability. The code StagYY is the property of P.J.T. and Eidgenössische
Technische Hochschule (ETH) Zürich and is available on request from P.J.T.
(paul.tackley@erdw.ethz.ch).

Building tectonic plates. We established a method to define the boundaries and 
the geometry of tectonic plates on the surface of our convection models. First the 
boundaries need to be identified to define the outline of the plates themselves (plate 
polygons). The same method was applied for each of the 18 snapshots of the models 
we present. This is a relatively small sample because the precise determination of the 
plate layout for one snapshot is very time-consuming. Only three snapshots have 
been studied for model 1 because of the large number of plates (more than 100). 
The GPlates software is used to trace all plate boundaries, interactively building 
digital plate-tectonic layouts. 




Identification of major boundaries. The first step here is to identify the major 
and localized boundaries on the surface of the convection models. We use the 
viscosity, temperature and velocity data. The maps of the sea-floor ages obtained 
from the heat flux (Extended Data Fig. 1a) allow the youngest zones (0 Myr old) 
to be identified as mid-ocean ridges and the oldest zones (180-280 Myr old) as 
subduction zones. In the same manner, we use maps of the horizontal divergence 
(Extended Data Fig. 1b) inferred from the surface velocities. Hence, the diver- 
gence zones show the localization of the mid-ocean ridges for dimensionless 
divergence values between 0 and 30,000 and the convergence zones show the 
subduction zones with values between −15,000 and 0. Transform zones (as our
model is continuous, there are no faults but shear zones) exist in our models and 
are identified via surface vorticity maps. To minimize the time it takes to inter- 
actively build plate boundary models, mid-ocean ridges and transform zones 
are included in the same group of boundaries. Nevertheless, for the model with 
a yield stress of 150 MPa, we computed a length of about 79,000 km for mid- 
ocean ridges on average and a length of about 2,600 km for transform regions. 
In comparison, these lengths on Earth are 67,000 km for mid-ocean ridges and
5,131 km for transform regions.

The identification of these two types of major boundaries (subduction zones and
mid-ocean ridges) does not always allow us to close polygons to obtain tectonic
plates. Even if some boundaries can be extrapolated, many zones necessitate more
thorough work, as discussed below.

Identification of diffuse boundaries. To close polygons, other boundaries need to be 
defined. The study of deviatoric stress allows us to identify some diffuse junctions. 
In the models, non-yielded boundaries are set between two zones where there is 
little change in the velocity vector. They exist in ductile zones that are visible as a 
result of a fan of velocity vectors (Extended Data Fig. 2). This geometric
configuration implies a large zone of deformation similar to that found for intraplate
deformation, which is defined as a diffuse boundary. That is exactly the definition of
diffuse boundaries on Earth (ref. 35). The delimitation of the diffuse boundaries between
two zones with different velocities implies a non-negligible error in the estimation 
of the Euler pole (and the calculated velocities) that we quantify. 

The identification of these three types of boundaries (mid-ocean ridges,
subduction zones and diffuse boundaries) allows us to close topological polygons
defined by these boundaries (Extended Data Fig. 1c). These polygons are tectonic
plates, but before they can be used we need to evaluate the error we made in the
delimitation of tectonic plates according to the plate tectonic theory.

Fit of the plate model with the convection model. We compare the raw velocity data 
of the convection models with the a posteriori velocities calculated using Euler’s 
theorem for the corresponding plate layout. We first extract the raw velocity 
data for each plate using the plate polygons determined previously. We then use 
the raw velocities to invert the angular velocity vector using the inverse method 
described in ref. 36, and compute the predicted velocities on the basis of the 
inverted angular velocity vector. As a measure of the quality of the fit of our plate 
model to the convection model, we compute the plateness P of the plate layout 
following ref. 37 


$$P = 1 - \frac{\Delta V_{\mathrm{rms}}}{V_{\mathrm{rms}}}$$


where ΔV_rms is the root-mean-square difference between the velocities of the
convection model and those predicted with plate rotations, and V_rms is the
root-mean-square surface velocity of the model. We obtain values of P between 0.75
and 0.81 (1 would be perfectly rigid plates, 0 would absolutely preclude the use of 
plate approximation), which is consistent with the fact that 90% of the deformation 
is concentrated in 15% of the surface of the models. 
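
The fit described above (a rigid rotation per plate, then a comparison with the raw velocities) can be sketched as follows. This is an illustration under stated assumptions, not the authors' workflow: it uses an ordinary least-squares fit in place of the inverse method of ref. 36, gives every surface point equal weight in the root-mean-square averages, and all function and variable names are hypothetical.

```python
import numpy as np


def fit_angular_velocity(positions, velocities):
    """Least-squares rotation vector omega such that v is close to omega x r."""
    rows = []
    for x, y, z in positions:
        # omega x r written as a matrix acting on omega = (wx, wy, wz).
        rows.append([[0.0, z, -y],
                     [-z, 0.0, x],
                     [y, -x, 0.0]])
    A = np.vstack(rows)                                   # shape (3N, 3)
    b = np.asarray(velocities, dtype=float).reshape(-1)   # shape (3N,)
    omega, *_ = np.linalg.lstsq(A, b, rcond=None)
    return omega


def plateness(positions, velocities, plate_ids):
    """P = 1 - dV_rms / V_rms, with one fitted rigid rotation per plate."""
    positions = np.asarray(positions, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    plate_ids = np.asarray(plate_ids)
    predicted = np.zeros_like(velocities)
    for pid in np.unique(plate_ids):
        mask = plate_ids == pid
        omega = fit_angular_velocity(positions[mask], velocities[mask])
        predicted[mask] = np.cross(omega, positions[mask])
    dv_rms = np.sqrt(np.mean(np.sum((velocities - predicted) ** 2, axis=1)))
    v_rms = np.sqrt(np.mean(np.sum(velocities ** 2, axis=1)))
    return 1.0 - dv_rms / v_rms


# Synthetic check: two perfectly rigid plates on a unit sphere give P close to 1.
rng = np.random.default_rng(0)
r = rng.normal(size=(200, 3))
r /= np.linalg.norm(r, axis=1, keepdims=True)
ids = (r[:, 2] > 0).astype(int)                    # northern and southern "plates"
true_omega = np.array([[0.1, 0.2, 0.3], [-0.3, 0.0, 0.5]])
v = np.cross(true_omega[ids], r)
print(plateness(r, v, ids))
```

Adding non-rigid deformation to the synthetic velocities lowers P, which is the sense in which values of 0.75 to 0.81 indicate plates that are mostly, but not perfectly, rigid.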


31. Lay, T., Hernlund, J. & Buffett, B. A. Core-mantle boundary heat flow.
Nat. Geosci. 1, 25-32 (2008). 

32. Ricard, Y., Richards, M., Lithgow-Bertelloni, C. & Le Stunff, Y. A geodynamic 
model of mantle density heterogeneity. J. Geophys. Res. 98, 21895-21909 
(1993). 

33. Mitrovica, J. X. Haskell [1935] revisited. J. Geophys. Res. 101, 555-569 
(1996). 

34. Tosi, N. et al. A community benchmark for viscoplastic thermal convection in a
2-D square box. Geochem. Geophys. Geosyst. 16, 2175-2196 (2015). 

35. Gordon, R. G. in The History and Dynamics of Global Plate Motions (eds 
Richards, M. A. et al.) 143-159 (Geophysical Monograph Series, Vol. 121, 
American Geophysical Union, 2000). 

36. Goudarzi, M. A., Cocard, M. & Santerre, R. EPC: Matlab software to estimate 
Euler pole parameters. GPS Solut. 18, 153-162 (2014). 

37. Zhong, S., Gurnis, M. & Moresi, L. Role of faults, nonlinear rheology, and 
viscosity structure in generating plates from instantaneous mantle flow 
models. J. Geophys. Res. 103, 15255-15268 (1998). 




[Colour scales: age of oceanic lithosphere (m.y.); non-dimensional horizontal divergence; plate area (km²) for the convection model and for Earth, binned into microplates, small plates and large plates.]

Extended Data Figure 1 | Maps of the surface of a snapshot from
a convection model with a yield stress of 150 MPa and of the plate
layout of Earth. a, Map of sea-floor age with the youngest ages in
red characteristic of mid-ocean ridges and the oldest zones in blue
characteristic of subduction zones. m.y., millions of years. b, Map of
non-dimensional horizontal divergence, with divergence zones (mid-ocean
ridges) shown in red and convergence zones (subduction zones) in blue.
c, d, Maps of the plate sizes of the convection model (c) and Earth (d).
The plate size categories are determined in Extended Data Fig. 3.




[Colour scale: dimensionless temperature. Legend: global surface velocities; fan-shaped velocities; diffuse zone; diffuse boundary.]

Extended Data Figure 2 | Subsurface temperature of a convection
model with a yield stress of 150 MPa showing a diffuse plate boundary.
a, Global temperature (colour scale) and surface velocities (arrows).
The dark zones represent subduction zones and the light zones indicate
mid-ocean ridges. b, Zoom-in of the red boxed region in a showing a
diffuse boundary; the steady lateral change of velocity directions
(red arrows) characterizes the intraplate diffuse zone (grey shaded area),
allowing the determination of a diffuse boundary (black dashed line).




[Plot panels: Snapshot 1, Snapshot 2, Snapshot 3, Snapshot 4, Snapshot 5 and Earth. Horizontal axis: log₁₀(plate size, km²); vertical axis: log₁₀(cumulative plate count).]


Extended Data Figure 3 | Plots of the logarithm of the cumulative plate
count versus the logarithm of the plate size for the snapshots of model 2
and Earth. The data for Earth is taken from ref. 2. The plots show the
distribution of microplates in light blue, small plates in mid-blue and large
plates in dark blue. The equations of the black fit lines and the correlation
coefficients R are also shown.
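
A straight-line fit of this kind, log₁₀(cumulative count) against log₁₀(plate area), can be reproduced with a short script. This is a minimal sketch, assuming an ordinary least-squares line per size class; the function name, the class limits passed as arguments and the synthetic areas are illustrative assumptions, not the plate data of ref. 2 or of the models.

```python
import numpy as np


def fit_cumulative_counts(areas_km2, min_area=0.0, max_area=np.inf):
    """Fit log10(cumulative count) = slope * log10(area) + intercept within one size class."""
    areas = np.sort(np.asarray(areas_km2, dtype=float))[::-1]  # largest plate first
    counts = np.arange(1, areas.size + 1)   # number of plates at least as large as areas[i]
    keep = (areas >= min_area) & (areas < max_area)
    x = np.log10(areas[keep])
    y = np.log10(counts[keep])
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]              # correlation coefficient R
    return slope, intercept, r


# Synthetic areas built so that the cumulative count follows a power law with exponent -0.3.
synthetic_areas = (np.arange(1, 51) / 100.0) ** (-1.0 / 0.3)
slope, intercept, r = fit_cumulative_counts(synthetic_areas)
print(slope, intercept, r)
```

Restricting `min_area` and `max_area` to one size class at a time gives one line per class, which is how separate slopes for microplates, small plates and large plates would be obtained.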




[Plot: fraction of large plates around the triple junctions (vertical axis, in per cent) versus type of triple junction (horizontal axis), for Earth and model 2. Background regions mark the dominance of trenches, of ridges and of diffuse boundaries.]


Extended Data Figure 4 | Plot of the fraction of large plates adjoining
a triple junction versus the type of triple junction for model 2 and
for Earth. The data for Earth is taken from ref. 2. The red rectangles
correspond to model 2 and the black circles to Earth. The coloured
backgrounds indicate the dominance of each boundary type: blue shows
triple junctions that are mainly composed of subduction zones, red
shows the dominance of mid-ocean ridges or transform boundaries
and green the dominance of diffuse boundaries. T, trenches; R, ridges;
D, diffuse boundary. We added a type of triple junction T(RRR); these
triple junctions are directly connected to curved trenches and produce
back-arc basins with small plates, hence they are included in the area of
the plot dominated by subduction zones. The error bars represent the
standard deviation of the fraction of large plates around a triple junction
for model 2 and Earth.




LETTER 


doi:10.1038/nature18326 


Anthropogenic disturbance in tropical forests can 
double biodiversity loss from deforestation 


Jos Barlow!?*, Gareth D. Lennox!, Joice Ferreira’, Erika Berenguer!, Alexander C. Lees*°, Ralph Mac Nally®, 
James R. Thomson®”, Silvio Frosini de Barros Ferraz®, Julio Louzada!*, Victor Hugo Fonseca Oliveira!?, Luke Parry), 
Ricardo Ribeiro de Castro Solar!, Ima C. G. Vieira’, Luiz E. O. C. Aragao!4!, Rodrigo Anzolin Begotti®, Rodrigo F. Braga’, 


Thiago Moreira Cardoso*, Raimundo Cosme de Oliveira Jr*, Carlos M. Souza J r3N argila G. Moura, Samia Serra Nunes!?, 


13 


Joao Victor Siqueira’’, Renata Pardini", Juliana M. Silveira)’, Fernando Z. Vaz-de-Mello!, Ruan Carlo Stulpen Veiga"®, 


Adriano Venturieri* & Toby A. Gardner!”® 


Concerted political attention has focused on reducing 
deforestation!~*, and this remains the cornerstone of most 
biodiversity conservation strategies*°. However, maintaining 
forest cover may not reduce anthropogenic forest disturbances, 
which are rarely considered in conservation programmes®. These 
disturbances occur both within forests, including selective logging 
and wildfires”’, and at the landscape level, through edge, area and 
isolation effects. Until now, the combined effect of anthropogenic 
disturbance on the conservation value of remnant primary forests 
has remained unknown, making it impossible to assess the relative 
importance of forest disturbance and forest loss. Here we address 
these knowledge gaps using a large data set of plants, birds and 
dung beetles (1,538, 460 and 156 species, respectively) sampled in 
36 catchments in the Brazilian state of Pará. Catchments retaining
more than 69-80% forest cover lost more conservation value from 
disturbance than from forest loss. For example, a 20% loss of primary 
forest, the maximum level of deforestation allowed on Amazonian 
properties under Brazil’s Forest Code’, resulted in a 39-54% loss 
of conservation value: 96-171% more than expected without 
considering disturbance effects. We extrapolated the disturbance-mediated
loss of conservation value throughout Pará, which covers
25% of the Brazilian Amazon. Although disturbed forests retained
considerable conservation value compared with deforested areas,
the toll of disturbance outside Pará’s strictly protected areas is
equivalent to the loss of 92,000-139,000 km² of primary forest. Even
this lowest estimate is greater than the area deforested across the 
entire Brazilian Amazon between 2006 and 2015 (ref. 10). Species 
distribution models showed that both landscape and within-forest 
disturbances contributed to biodiversity loss, with the greatest 
negative effects on species of high conservation and functional value. 
These results demonstrate an urgent need for policy interventions 
that go beyond the maintenance of forest cover to safeguard the 
hyper-diversity of tropical forest ecosystems. 

Protecting tropical forests is a fundamental pillar of many national 
and international strategies for conserving biodiversity**. Although 
improved regulatory and incentive measures have reduced deforesta- 
tion rates in some tropical nations’)”, the conservation value of the 


world’s remaining primary forests may be undermined by the addi- 
tional impacts of disturbance, which falls into two broad categories (see 
Methods). First, landscape disturbance results from deforestation itself, 
with area, isolation and edge effects degrading the condition of the 
remaining forests’. Second, within-forest disturbance, such as wildfires 
and selective logging, induces marked changes in forest structure and 
species composition®!%, 

Although the biodiversity consequences of both forms of distur- 
bance are well studied, previous research has overwhelmingly focused 
on identifying the isolated effects of specific types of disturbance!*”>. 
Such studies provide an incomplete understanding of the total 
disturbance-mediated loss of conservation value arising from multiple 
interacting drivers! and are unable to quantify the extent to which 
reducing forest loss will succeed in protecting tropical forest biodiver- 
sity. Addressing these knowledge gaps is vital for informing forest man- 
agement strategies in tropical nations, not least because within-forest 
disturbance can increase even as deforestation rates fall”'”!” and thus 
requires different policy interventions (Extended Data Table 1). 

We estimated the combined effects of landscape and within-forest 
disturbance on biodiversity in primary forests and compared these 
impacts to the biodiversity loss expected in deforested areas, offering, 
to our knowledge, the first such analysis for anywhere in the world. 
Our study focused on two large (>10,000 km²) frontier regions of the
Brazilian Amazon: Paragominas and Santarém, located in the state of
Pará (see Methods). Large- and small-stemmed plants, birds and dung
beetles were sampled in 371 plots in 36 study catchments distributed 
along a landscape deforestation gradient (0-94%) (Extended Data 
Fig. 1). A total of 31 catchments contained remnant primary forests. 
Within these catchments, we sampled 175 primary forest plots. Of 
these, 145 had visible evidence of within-forest disturbance (logging 
and/or fire). The remaining 30 had no evidence of within-forest dis- 
turbance and, being located in the largest remaining forest blocks, had 
minimal landscape disturbance'*!” (see Methods). Irrespective of their 
disturbance history, these primary forest plots held considerably more 
forest species than all other major land-uses (Extended Data Fig. 2). 

We used the sum of forest species presences in primary forest plots to 
estimate the conservation value of a catchment (see Methods). As plots 


¹Lancaster Environment Centre, Lancaster University, Lancaster LA1 4YQ, UK. ²MCTI/Museu Paraense Emílio Goeldi, CP 399, Belém, Pará, CEP 66040-170, Brazil. ³Universidade Federal de Lavras, 
Setor de Ecologia e Conservação, Lavras, Minas Gerais, CEP 37200-000, Brazil. ⁴EMBRAPA Amazônia Oriental, Belém, Pará, CEP 66095-100, Brazil. ⁵Cornell Lab of Ornithology, Cornell University, 
Ithaca, New York 14850, USA. ⁶Institute for Applied Ecology, University of Canberra, Bruce, Australian Capital Territory 2617, Australia. ⁷Arthur Rylah Institute for Environmental Research, 
Department of Environment, Land, Water and Planning, 123 Brown Street, Heidelberg, Victoria 3084, Australia. ⁸Universidade de São Paulo, Escola Superior de Agricultura “Luiz de Queiroz”, 
Esalq/USP, Avenida Pádua Dias, 11, São Dimas, Piracicaba, SP, CEP 13418-900, Brazil. ⁹Universidade Federal do Pará (UFPA), Núcleo de Altos Estudos Amazônicos (NAEA), Av. Perimetral, Número 1, 
Guamá, Belém-Pará, CEP 66075-750, Brazil. ¹⁰Universidade Federal de Viçosa, Departamento de Biologia Geral, Av. PH Rolfs s/n, Viçosa, Minas Gerais, CEP 36570-900, Brazil. ¹¹Tropical 
Ecosystems and Environmental Sciences Group (TREES), Remote Sensing Division, National Institute for Space Research (INPE), Avenida dos Astronautas, 1.758, Jd. Granja, São José dos Campos, 
CEP 12227-010, SP, Brazil. ¹²College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4RJ, UK. ¹³IMAZON, Rua Dom Romualdo de Seixas 1698, Edifício Zion, 11 andar, CEP 
66055-200 Belém, PA, Brazil. ¹⁴Instituto de Biociências, Universidade de São Paulo, Rua do Matão, Travessa 14, 101, CEP 05508-090 São Paulo, Brazil. ¹⁵Universidade Federal de Mato Grosso, 
Instituto de Biociências, Departamento de Biologia e Zoologia, Av. Fernando Corrêa da Costa, 2367, Boa Esperança, CEP 78060-900, Cuiabá, MT, Brazil. ¹⁶Instituto Sócio Ambiental Serra do 
Mar (ISASM), Estrada Ribeirão das Voltas s/n, Lumiar, CEP 28616-010, Nova Friburgo, Brazil. ¹⁷Stockholm Environment Institute, Linnégatan 87D, Box 24218, Stockholm 104 51, Sweden. 
¹⁸International Institute for Sustainability, Estrada Dona Castorina, 124, Horto, Rio de Janeiro, 22460-320, Brazil. 




© 2016 Macmillan Publishers Limited. All rights reserved 




[Figure 1 plots: y-axes, conservation value, absolute loss (CVD) and proportionate loss; x-axis, primary forest cover.] 


Figure 1 | The conservation status of primary forests. a, Conservation 
value in Paragominas (circles) and Santarém (triangles). b, Total loss of 
conservation value due to disturbance. c, Total loss of conservation value 
due to disturbance expressed as a proportion of the expected conservation 
value without disturbance. Dashed lines show expectations without 
disturbance. Grey lines show all regressions, with the black solid line 
showing the median response (see Methods). Values were standardized 
across study regions. There was no significant difference in conservation 
values between regions in the median response (F₁,₂₆ = 1.45, P = 0.24, 
analysis of covariance (ANCOVA)). 


were allocated in proportion to catchment forest cover, this measure is 
equivalent to the mean species richness (per unit area) in primary for- 
ests multiplied by the proportion of primary forest cover. In the absence 
of landscape or within-forest disturbances, the expectation of conser- 
vation value should respond linearly to forest cover, with slope equal to 
mean species density (see Methods). The difference between this linear 
expectation and the observed conservation value of the remaining pri- 
mary forest provides an estimate of the total biodiversity impact of all 
landscape and within-forest disturbance. We refer to this difference as 
the conservation value deficit (CVD). We take a variety of approaches 




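[Figure 2b plot: y-axis, loss of conservation value (0.0 to 0.8); legend, forest loss and disturbance; x-axis categories, PA, TA, GU, XI, BE, RO. Bar labels as extracted, with their assignment to bars not recoverable: 13–23, 20–33, 38–57, 106–151, 92–137, 135–178.] 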

Figure 2 | Conservation value deficit over large spatial scales. 
a, Proportionate loss of conservation value (CV) from disturbance in 
Pará (median estimate; see Methods). Areas of endemism (AoE) are: 
Belém (BE), Guiana (GU), Rondônia (RO), Tapajós (TA) and Xingu (XI). 
These do not include the island of Marajó (MA). Grey shading denotes 
strictly protected areas. b, Proportionate loss of CV in Pará (PA) and its 
AoEs from forest loss and disturbance (median estimate). Error bars 
show the range over all approaches to estimating conservation value 
(see Methods). Numbers show disturbance relative to forest loss 
(percentage range over approaches). 


to calculating the CVD, reflecting different ways of classifying forest 
species, weighting their conservation value and calculating species den- 
sity in undisturbed forest (see Methods). Here, we report median results 
from our sensitivity analysis along with the lower and upper bound 
range. Full results are shown in Fig. 1 and Extended Data Figs 3 and 4. 

The conservation value of the remaining primary forests was lower 
than expected along the entire deforestation gradient. The CVD was 
unimodal with forest cover, reaching its maximum in catchments with 
83% of their primary forests. These catchments retained just 58% of 
their conservation value (range: 48 to 65%) (Fig. 1a). The CVD was 
relatively small at low levels of forest cover (Fig. 1b). However, distur- 
bance caused the greatest proportionate loss of conservation value in 
these catchments, accounting for an approximately 20-50% shortfall in 
the level of biodiversity that would be predicted for undisturbed forests 
(Fig. 1c). The robustness of our estimates of the CVD was supported 
by the similarity of responses across study regions (Fig. 1) and sampled 
taxa (Extended Data Fig. 3). 

The relationship we derived between forest cover and conservation 
value allowed us, for the first time, to estimate the additional total effect 
of forest disturbance over large spatial scales. We therefore mapped 
the disturbance-induced loss of conservation value (CVD) across 
Pará, which covers 1.26 × 10⁶ km². We divided the state into grid cells 
approximately equal in area to our study catchments (~50 km²). In 
total, 73% of the ~26,000 cells covering the state were located in private 
lands or sustainable-use reserves. For these locations, which are most 




[Figure 3 plots: panels a, b (landscape disturbance; x-axis, forest cover) and c, d (within-forest disturbance; x-axis, undegraded forest) for Paragominas and Santarém; panels e–h, x-axis, disturbance sensitivity of response groups (low to high).] 

Figure 3 | Response of forest birds to disturbance. a–d, The odds of 
detecting species groups along gradients of landscape (a, b) and 
within-forest (c, d) disturbance in Paragominas (a, c) and Santarém (b, d) 
(see Methods). Species groups, shown by different coloured lines, are 
composed of species with similar disturbance responses (see Methods). 
Line thickness represents the relative size of the groups. e–h, Disturbance 
sensitivity of the species groups related to their mean range size 
(10” km²). Error bars show s.e.m. Group colours correspond to groupings 
in a–d. Black lines show significant relationships (P < 0.05, F-test) 
(see Methods). 

comparable to our study catchments, the total CVD was equivalent to 
~123,000 km² of forest loss (range: 92,000 to 139,000 km²). To put this 
figure in context, it is 51% (range: 38 to 57%) of the total area deforested 
across Pará to date (Extended Data Table 2). 

Our state-wide analysis revealed considerable spatial variation in 
the CVD, reflecting differences in deforestation histories (Fig. 2a). 
We illustrate this variation by estimating the additional loss of 
conservation value due to disturbance across Pará’s five major 
biogeographic zones (areas of endemism²⁰, AoE). Median disturbance 
impacts outweighed biodiversity losses in deforested areas alone 
in three of the five AoEs (Fig. 2b). The high relative impact of 
disturbance is shown in the Guiana AoE, where the predicted loss of 


[Figure 4 plots, landscape disturbance: panels a, b (Paragominas, Santarém), y-axis, relative odds of species detection, x-axis, forest cover; panels e, f, x-axis, disturbance sensitivity of response groups (low to high).] 


conservation value from disturbance was 135-178% of the losses 
estimated in deforested areas. The relative impact of disturbance was 
lowest in the Belém AoE, which has lost 62% of its native forest cover 
and is the most deforested AoE in Amazonia. Nonetheless, overall 
disturbance effects reduced Belém’s estimated conservation value 
from 38% when based on forest cover alone to just 26% (range: 24 
to 30%). 
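
The Belém figures can be cross-checked with simple arithmetic. The sketch below assumes, as one reading of the text rather than the authors' stated procedure, that disturbance relative to forest loss is the drop in conservation value (cover-based value minus disturbance-adjusted value) divided by the share of forest lost. The function name is hypothetical.

```python
def disturbance_relative_to_forest_loss(forest_lost, cv_cover_only, cv_with_disturbance):
    """Extra loss from disturbance as a fraction of the loss from deforestation.

    All arguments are proportions (0-1). This ratio is an assumed reading of
    the text, not a formula given by the authors.
    """
    return (cv_cover_only - cv_with_disturbance) / forest_lost


# Belém AoE values quoted in the text: 62% of forest lost, 38% of conservation
# value left on forest cover alone, 26% (range 24-30%) once disturbance is included.
median = disturbance_relative_to_forest_loss(0.62, 0.38, 0.26)
low = disturbance_relative_to_forest_loss(0.62, 0.38, 0.30)
high = disturbance_relative_to_forest_loss(0.62, 0.38, 0.24)

print(f"{low:.0%} to {high:.0%} (median {median:.0%})")  # 13% to 23% (median 19%)
```

Under that assumption, disturbance in Belém adds roughly 13 to 23% on top of the losses attributed to deforestation, while still costing about 12 percentage points of conservation value.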

The widespread and substantial depletion of conservation value in 
remaining primary forests highlights the pressing need for policies that 
target the most prominent drivers of disturbance-induced biodiver- 
sity loss. Although measures to combat deforestation may help limit 
landscape disturbance, they rarely consider the spatial configuration of 


[Figure 4 plots, within-forest disturbance: panels c, d, x-axis, undegraded forest; panels g, h, x-axis, disturbance sensitivity of response groups (low to high).] 


Figure 4 | Response of large-stemmed plants to disturbance. a–d, The 
odds of detecting species groups along gradients of landscape (a, b) and 
within-forest (c, d) disturbance in Paragominas (a, c) and Santarém (b, d) 
(see Methods). Species groups, shown by different coloured lines, are 
composed of species with similar disturbance responses (see Methods). 
Line thickness represents the relative size of the groups. e–h, Disturbance 
sensitivity of the species groups related to their mean wood density 
(g cm⁻³). Error bars show s.e.m. Group colours correspond to groupings 
in a–d. Black lines show significant relationships (P < 0.05, F-test) 
(see Methods). 




remaining forests or work to actively reduce within-forest disturbance °7! 
(Extended Data Table 1). 

Here we provide insights into the need for additional policies to 
reduce forest disturbance by examining the relative importance of 
landscape and within-forest disturbance on species distributions 
using Random Forests (see Methods). In ranking the importance of 
remotely sensed disturbance measures, we found that both forms of 
disturbance had significant additional effects on species’ distributions, 
albeit with some region- and taxon-specific variation (Extended Data 
Figs 5-7 and Methods). We then used the measures of landscape and 
within-forest disturbance that were most frequently ranked high- 
est to examine changes in taxon community structure, using Latent 
Trajectory Analysis to group species by their responses to disturbance 
(see Methods). Results showed a consistent and high level of com- 
munity turnover from both forms of disturbance, with some species 
groups responding negatively and others positively (Figs 3 and 4 and 
Extended Data Fig. 8). These responses may explain the unimodal 
shape of the disturbance effect (Fig. 1b) because they are consistent 
with the loss of highly sensitive species at relatively low levels of forest 
disturbance and the dominance of more resistant taxa in the most 
disturbed forests. Finally, we linked species’ response groups with life- 
history data available for birds and large-stemmed plants (see 
Methods). Both types of disturbance contributed to marked declines 
in species of high conservation and functional importance (birds 
with smaller range sizes**”? and plants with higher wood density”**, 
respectively) (Figs 3 and 4). These analyses almost certainly underesti- 
mate the adverse effects of disturbance because rare species, which are 
often most sensitive to human impacts in forest ecosystems”’, cannot 
be adequately modelled. 

We provide compelling evidence that Amazonian conservation ini- 
tiatives must address forest disturbance as well as deforestation. At its 
most stringent, Brazil’s centrepiece environmental legislation, the Forest 
Code, mandates Amazonian landowners to maintain 80% of their pri- 
mary forest cover. Our results show that even where this level of com- 
pliance is achieved, the primary forests of these landscapes may only 
retain 46-61% of their potential conservation value and are likely to 
have lost many species of high conservation and functional importance. 
These findings reinforce the need to reduce the effects of landscape 
fragmentation by zoning development activities, thereby ensuring 
the protection of large blocks of remaining forest in all biogeographic 
zones. Where deforestation has already occurred, further conservation 
losses can be minimised by preventing within-forest disturbance, aiding 
the recovery of already degraded forests, and investing in forest resto- 
ration to improve connectivity and buffer remnant forests from edge 
effects. Engendering change will require a mixture of incentive and 
regulatory-based measures to improve the sustainability of both forestry 
and farming practices. Crucially, because reducing forest disturbance 
requires coordinated efforts by many actors, interventions need to 
move beyond individual properties and address entire landscapes and 
regions. Such actions are urgently needed in the Amazon where log- 
ging operations are rapidly expanding across federal and state forests”*, 
wildfires are increasingly prevalent during more frequent and severe dry 
seasons””, and the expansion of industrial agriculture, energy and mining 
threaten even strictly protected areas and indigenous lands”. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 3 March; accepted 17 May 2016. 
Published online 29 June 2016. 


1. Boucher, D., Elias, P., Faires, J. & Smith, S. Deforestation Success Stories: 
Tropical Nations Where Forest Protection and Reforestation Policies Have 
Worked. Union of Concerned Scientists June 2014 Report (2014). 

2. Nepstad, D. et al. The end of deforestation in the Brazilian Amazon. Science 
326, 1350-1351 (2009). 

3. Soares-Filho, B. S. et al. Modelling conservation in the Amazon basin. 
Nature 440, 520-523 (2006). 




4. Convention on Biological Diversity. Strategic Plan for Biodiversity 2011-2020, 
Aichi Biodiversity Targets https://www.cbd.int/sp/default.shtml (2015). 

5. Legislative Database of the Food and Agricultural Organization of the United 
Nations (FAOLEX). Brazilian Environmental Law number 12.651 (25 March 2012). 

6. Panfil, S. N. & Harvey, C. A. REDD+ and Biodiversity Conservation: A review of 
the biodiversity goals, monitoring methods and impacts of 80 REDD+ projects. 
Conserv. Lett. 9, 143-150 (2015). 

7. Aragao, L. E. O. C. & Shimabukuro, Y. E. The incidence of fire in Amazonian 
forests with implications for REDD. Science 328, 1275-1278 (2010). 

8. Burivalova, Z., Sekercioglu, C. H. & Koh, L. P. Thresholds of logging intensity to 
maintain tropical forest biodiversity. Curr. Biol. 24, 1893-1898 (2014). 

9. Ewers, R. M. & Didham, R. K. Confounding factors in the detection of species 
responses to habitat fragmentation. Biol. Rev. Camb. Philos. Soc. 81, 117-142 
(2006). 

10. Instituto Nacional de Pesquisas Espaciais (INPE). Projeto Prodes: Amazon 
deforestation database. Available at www.obt.inpe.br/prodes (2015). 

11. Hansen, M. C. et al. High-resolution global maps of 21st-century forest cover 
change. Science 342, 850-853 (2013). 

12. Sloan, S. & Sayer, J. Forest Resources Assessment of 2015 shows positive 
global trends but forest loss and degradation persist in poor tropical countries. 
For. Ecol. Manage. 352, 134-145 (2015). 

13. Barlow, J. & Peres, C. A. Avifaunal responses to single and recurrent wildfires in 
Amazonian forests. Ecol. Appl. 14, 1358-1373 (2004). 

14. Lewis, S. L., Edwards, D. P. & Galbraith, D. Increasing human dominance of 
tropical forests. Science 349, 827-832 (2015). 

15. Gibson, L. et al. Primary forests are irreplaceable for sustaining tropical 
biodiversity. Nature 478, 378-381 (2011). 

16. Malhi, Y., Gardner, T. A., Goldsmith, G. R., Silman, M. R. & Zelazowski, P. Tropical 
Forests in the Anthropocene. Annu. Rev. Environ. Resour. 39, 125-159 (2014). 

17. Morton, D. C., Le Page, Y., DeFries, R., Collatz, G. J. & Hurtt, G. C. Understorey 
fire frequency and the fate of burned forests in southern Amazonia. Phil. Trans. 
R. Soc. B 368, 1-8 (2013). 

18. Gardner, T. A. et al. A social and ecological assessment of tropical land uses at 
multiple scales: the Sustainable Amazon Network. Phil. Trans. R. Soc. B 368, 
20120166 (2013). 

19. Berenguer, E. et al. A large-scale field assessment of carbon stocks in 
human-modified tropical forests. Glob. Chang. Biol. 20, 3713-3726 (2014). 

20. da Silva, J. M. C., Rylands, A. B. & Da Fonseca, G. A. B. The fate of the 
Amazonian areas of endemism. Conserv. Biol. 19, 689-694 (2005). 

21. International Union of Forest Research Organizations (IUFRO). Understanding 
Relationships between Biodiversity, Carbon, Forests and People: The Key to 
Achieving REDD+ Objectives (eds Parrotta, J. A., Wildburger, C. & Mansourian, S.) 
(2012). 

22. Manne, L. L., Brooks, T. M. & Pimm, S. L. Relative risk of extinction of passerine 
birds on continents and islands. Nature 399, 258-261 (1999). 

23. Purvis, A., Gittleman, J. L., Cowlishaw, G. & Mace, G. M. Predicting extinction 
risk in declining species. Proc. Biol. Sci. 267, 1947-1952 (2000). 

24. Chave, J. et al. Towards a worldwide wood economics spectrum. Ecol. Lett. 12, 
351-366 (2009). 

25. Phillips, O. L. et al. Drought sensitivity of the Amazon rainforest. Science 323, 
1344-1347 (2009). 

26. Baker, T. R. et al. Variation in wood density determines spatial patterns in 
Amazonian forest biomass. Glob. Chang. Biol. 10, 545-562 (2004). 

27. Banks-Leite, C. et al. Assessing the utility of statistical adjustments for imperfect 
detection in tropical conservation science. J. Appl. Ecol. 51, 849-859 (2014). 

28. Gestao de Florestas Publicas - Relatério 2015. Brasilia: MMA/SFB Available at 
http://www.florestal.gov.br/publicacoes/instrumento-de-gestao (2015). 

29. Chen, Y. et al. Forecasting fire season severity in South America using sea 
surface temperature anomalies. Science 334, 787-791 (2011). 

30. Ferreira, J. et al. Environment and Development. Brazil’s environmental 
leadership at risk. Science 346, 706-707 (2014). 


Acknowledgements This work was supported by grants from Brazil 
(CNPq 574008/2008-0, 458022/2013-6, and 400640/2012-0; Embrapa 
SEG:02.08.06.005.00; The Nature Conservancy – Brasil; CAPES scholarships), 
the UK (Darwin Initiative 17-023; NE/F01614X/1; NE/G000816/1; 
NE/F015356/2; NE/I018123/1; NE/K016431/1), Formas 2013-1571, and 
Australian Research Council grant DP120100797. Institutional support was 
provided by the Herbário IAN in Belém, LBA in Santarém and FAPEMAT. R.M. 
and J.R.T. were supported by Australian Research Council grant DP120100797. 
This is paper no. 49 in the Sustainable Amazon Network series. 


Author Contributions T.A.G., J.F. and J.B. designed the research with additional 
input from E.B., A.C.L., S.F.B.F., J.L., V.H.F.O., L.P., R.R.C.S., I.C.G.V., L.E.O.C.A. and 
R.P. E.B., A.C.L., V.H.F.O., R.R.C.S., R.F.B., J.F., R.C.O., N.G.M., R.C.S.V., J.L., J.M.S. 
and F.Z.V. collected the field data or analysed biological or soil samples. G.D.L. 
analysed the data, with input from J.B., J.R.T., R.M., A.C.L. and T.A.G. S.F.B.F., 
R.A.B., T.M.C., C.M.S., S.S.N., J.V.S., A.V. and T.A.G. processed the remote sensing 
data. J.B., G.D.L., J.F., A.C.L., R.M., J.R.T. and T.A.G. wrote the manuscript, with 
input from all authors. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
J.B. (josbarlow@gmail.com). 




METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized, and investigators were not blinded to allocation during 
experiments and outcome assessment. 

Study regions. Pará is the second largest state in Brazil and a focal point for 
deforestation, accounting for 34% of all forest loss in the Brazilian Amazon between 
1988 and 2015 (ref. 10). It holds exceptionally high biodiversity, with ~10% of 
the world’s bird species and five of the eight major AoEs in Amazonia²⁰. Within 
Pará, we focused on two geographically and biologically distinct regions: the 
municipalities of Paragominas and Santarém-Belterra-Mojuí dos Campos 
(abbreviated to Santarém) (Extended Data Fig. 1). These regions lie in different AoEs 
(Belém and Tapajós, respectively) and shared just 49% of our sampled taxa. 
Although they differ in their human colonization history!’, both retain >50% of 
their native forest cover. 

Study design and biodiversity sampling. We divided each region into third- or 
fourth-order drainage catchments. In each region, 18 study catchments 
(32–61 km²) were then distributed along forest cover gradients. We distributed 
study plots on terra firme in proportion to forest and non-forest cover at a density 
of approximately 1 plot per 4 km², resulting in 8–12 plots separated by >1.5 km 
in each catchment (Extended Data Fig. 1). Forest plots (n = 234) were distributed 
without previous knowledge of anthropogenic disturbance¹⁸ and included 
primary forests (that is, under permanent forest cover; n = 175) and secondary forests 
recovering after agricultural abandonment (n = 59). Non-forest plots (n= 133) 
were predominantly located in pastures (n = 76) and mechanised agricultural 
lands (n=31). 

In total, 31 of the 36 catchments contained primary forest plots. In Paragominas 
and Santarém respectively, these included undisturbed (13 and 17), logged (44 
and 26), burned (0 and 7) and logged and burned primary forests (44 and 24)’. 
Disturbance categories were based on field assessments of fire scars, charcoal and 
logging debris, and an analysis of canopy disturbance, deforestation and regrowth 
in time series satellite images (1988 to 2010)'*. Plots in the undisturbed forest had 
no evidence of within-forest disturbance and, because they were located more than 
2km from edges in the largest forest blocks, had minimal landscape disturbance. 
Observations of hunting-sensitive large game birds, such as razor-billed curassow 
Pauxi tuberosa and trumpeters Psophia spp.*'*”, indicated low hunting pressure** 
in undisturbed plots. 
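
The plot counts quoted above can be tallied to confirm that they agree with the totals given in the main text (175 primary forest plots, of which 145 were disturbed and 30 undisturbed). This is only a bookkeeping sketch; the dictionary keys are labels chosen for illustration.

```python
# Primary forest plots per disturbance class, as (Paragominas, Santarém).
plots = {
    "undisturbed": (13, 17),
    "logged": (44, 26),
    "burned": (0, 7),
    "logged_and_burned": (44, 24),
}

# Sum each region's column, then the overall totals.
per_region = [sum(counts[i] for counts in plots.values()) for i in range(2)]
total = sum(per_region)
undisturbed = sum(plots["undisturbed"])
disturbed = total - undisturbed

print(per_region, total, undisturbed, disturbed)  # [101, 74] 175 30 145
```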

Biodiversity surveys occurred during 2010 and 2011. The following descrip- 
tions apply to sampling at the plot level. Large and small stems: live trees and 
palms with >10 cm diameter at breast height were identified in 10 × 250 m plots. 
Smaller individuals (2–10 cm diameter) were sampled in five 5 × 20 m subplots 
(Extended Data Fig. 1). Liana diameters were measured at 1.3 m from the main 
root. Large- and small-stemmed plants were analysed separately because they may 
differ in their disturbance responses. Individuals were identified to species level 
by local parabotanists’. In total across all catchments, 175 plots and 825 subplots 
were sampled in primary forests. Birds: there were two repeat surveys of 15-min 
point counts at three sampling points (0, 150 and 300 m) (Extended Data Fig. 1). 
Sampling was undertaken between 15 min before dawn and 09:30. Lists of voucher 
sound-recordings and images are available for both regions*!. In total across all 
catchments, 1,050 point counts were undertaken in primary forests. Dung beetles: 
sampled using pitfall traps (14 cm radius, 9 cm height) baited with 50 g of dung 
(80% pig and 20% human) and half filled with a killing solution (5% detergent 
and 2% salt). Traps were left for 48 h before inspection. Three traps were placed at 
the corners of a 3 m equilateral triangle, repeated at three sampling points (0, 150 
and 300 m). In total across all catchments, 1,575 pitfall traps were set in primary 
forests (Extended Data Fig. 1). 
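
The bird and dung beetle sampling totals follow directly from the per-plot protocol. The short check below assumes that every one of the 175 primary forest plots received the full protocol; the constant names are illustrative.

```python
PRIMARY_FOREST_PLOTS = 175
SAMPLING_POINTS_PER_PLOT = 3  # at 0, 150 and 300 m along each plot

# Birds: two repeat surveys of a point count at each sampling point.
bird_point_counts = PRIMARY_FOREST_PLOTS * SAMPLING_POINTS_PER_PLOT * 2

# Dung beetles: three pitfall traps (one triangle) at each sampling point.
pitfall_traps = PRIMARY_FOREST_PLOTS * SAMPLING_POINTS_PER_PLOT * 3

print(bird_point_counts, pitfall_traps)  # 1050 1575
```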

Defining the biodiversity consequences of forest loss, landscape and within- 
forest disturbance. We limit the biodiversity consequences of forest loss to those 
that occur in deforested areas themselves, excluding any additional effects on 
remaining forests. Landscape disturbance then captures the combined edge, area 
and isolation effects that accompany the deforestation process. Within-forest 
disturbance refers to anthropogenic disturbance events that are not inevitable 
consequences of forest loss or land cover change, including wildfires, hunting 
and selective logging. Although often associated with landscape factors, such as 
distance from forest edge, within-forest disturbance can occur independently of 
changes in forest cover or landscape configuration. 

Estimating the conservation value deficit. We used the sum of forest species 
presences in primary forest plots to measure the conservation value of a catchment. 
In practice, this means that if a forest species occurs on x plots within a catchment, 
the species contributes x to the catchment’s conservation value. Total catchment 
conservation value is found by summing presences over all forest species. This 
measure is equivalent to mean species richness (per unit area) in primary forests 
multiplied by primary forest cover. In the absence of disturbance, conservation 




value should therefore respond linearly to forest cover, with slope equal to mean 
species density, d.. We term the difference between this linear expectation and a 
catchment’s observed conservation value as its conservation value deficit (CVD). 
We took a variety of approaches to calculating the CVD, reflecting different 
methods of defining forest species, weighting their importance, and calculating d.. 
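
As a minimal sketch of the measure just described, the code below sums forest species presences over a catchment's primary forest plots and subtracts the result from a linear expectation. The names `d_ref` (standing in for the mean species density in undisturbed forest) and `plots_at_full_cover` (the number of plots a fully forested catchment would receive) are assumptions made for illustration, as are the toy data.

```python
def conservation_value(primary_plots, forest_species):
    """Sum of forest-species presences over a catchment's primary forest plots."""
    return sum(len(plot & forest_species) for plot in primary_plots)


def conservation_value_deficit(observed_value, forest_cover, d_ref, plots_at_full_cover):
    """Linear expectation minus observed conservation value.

    Assumes plots are allocated in proportion to forest cover, so an
    undisturbed catchment with cover `forest_cover` would hold
    forest_cover * plots_at_full_cover plots, each with d_ref forest
    species on average.
    """
    expected = d_ref * forest_cover * plots_at_full_cover
    return expected - observed_value


# Toy catchment: 50% primary forest cover, sampled with 2 of a possible 4 plots.
forest_species = {"sp_a", "sp_b", "sp_c", "sp_d"}
primary_plots = [{"sp_a", "sp_b", "pasture_sp"}, {"sp_a"}]

observed = conservation_value(primary_plots, forest_species)  # 2 + 1 presences
cvd = conservation_value_deficit(observed, forest_cover=0.5, d_ref=3.0, plots_at_full_cover=4)
print(observed, cvd)  # 3 3.0
```

A species present on x plots contributes x to the total, so widespread forest species count for more than species recorded only once.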
Defining forest species. We restricted our analysis to ‘forest species’ to avoid 
attributing value to invasive and open-area species. We used three species classi- 
fication filters: (i) an automatic filter defined forest species as those that occurred 
at least once in a primary forest plot, irrespective of the plot’s disturbance history 
(n= 1,621 species); (ii) a high basal area (HBA) filter defined forest species to 
be those that occurred at least once in plots with a high average basal area (that 
is, greater than or equal to the lowest basal area recorded in undisturbed forests 
in each region) (n = 1,290); and (iii) a convex hull filter where we first applied a 
two-dimensional non-metric multidimensional scaling (MDS) to primary and 
secondary forest plots based on a stem-size classification (stress = 0.14), and then 
defined forest species to be those that occurred at least once in plots within the 
minimum convex hull of undisturbed primary forest plots on the MDS (n= 1,140). 
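
The first (automatic) filter is the simplest of the three and can be sketched as a set union. The record layout, a list of (land use, species set) pairs, is a hypothetical stand-in for the survey data. The HBA and convex hull filters would need basal area and ordination results and are not shown.

```python
def automatic_filter(plots):
    """Return species recorded at least once in any primary forest plot.

    `plots` is an iterable of (land_use, species_set) pairs; disturbance
    history of the primary forest plot is deliberately ignored.
    """
    forest_species = set()
    for land_use, species in plots:
        if land_use == "primary_forest":
            forest_species |= species
    return forest_species


plots = [
    ("primary_forest", {"sp_a", "sp_b"}),
    ("primary_forest", {"sp_b", "sp_c"}),
    ("pasture", {"sp_c", "sp_d"}),
]
print(sorted(automatic_filter(plots)))  # ['sp_a', 'sp_b', 'sp_c']
```

Here `sp_d` is excluded because it was only recorded outside primary forest, whereas `sp_c` qualifies despite also occurring in pasture.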
Species conservation or functional importance. We used three approaches to 
weight species’ importance. First, we assumed that all forest species had a value 
equal to 1. Second, we applied a linear weighting to birds according to their range 
size and plants according to their wood density. The bird species with the smallest 
range size was given a value of 1, and that with the largest range size was given a 
value of 0 (vice versa for plants and wood density), with all other species’ values 
mapped linearly between these two points. Third, we squared the linear weighting 
to give an even higher relative value to species of highest conservation or functional 
importance. 
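
The linear and squared weightings can be illustrated as follows. This is a sketch under the assumption that the trait has at least two distinct values; the function names and the toy trait values are hypothetical.

```python
def linear_weights(trait, high_value_is_small=True):
    """Map trait values linearly onto [0, 1].

    With high_value_is_small=True (birds, range size) the smallest trait
    value gets weight 1 and the largest gets 0. With False (plants, wood
    density) the direction is reversed.
    """
    lo, hi = min(trait.values()), max(trait.values())
    span = hi - lo  # assumed non-zero
    weights = {}
    for species, value in trait.items():
        scaled = (value - lo) / span  # 0 at the minimum, 1 at the maximum
        weights[species] = 1.0 - scaled if high_value_is_small else scaled
    return weights


def squared_weights(weights):
    """Square the linear weights to emphasise the highest-value species."""
    return {species: w ** 2 for species, w in weights.items()}


range_size = {"bird_a": 1.0, "bird_b": 3.0, "bird_c": 5.0}  # arbitrary units
linear = linear_weights(range_size)
print(linear)                   # {'bird_a': 1.0, 'bird_b': 0.5, 'bird_c': 0.0}
print(squared_weights(linear))  # {'bird_a': 1.0, 'bird_b': 0.25, 'bird_c': 0.0}
```

Squaring leaves the extremes unchanged but halves the weight of the mid-range species in this example, which is how the third approach shifts relative value towards the most important species.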

There are many important life-history traits that correlate with species’ conser- 
vation or functional importance. Our choices were based on a priori knowledge 
and the availability of data for diverse tropical taxa. For birds, we chose range size 
because it is the single most important predictor of threat status”’, especially among 
lowland passerines (which make up the majority of our sample) where it is inversely 
correlated with other important factors such as population density”. For plants, we 
chose wood density because it is the most important size-independent determinant 
of carbon storage within individual stems, a strong predictor of carbon stocks 
across the biome”, and is also linked with other functional properties” including 
drought resistance”’. Bird range sizes were extracted from the Birdlife Datazone 
(http://www.birdlife.org/datazone/index.html). Wood densities were adapted from 
the global wood density database*, using the genus or family average where species 
or genus data were unavailable. Lianas were given a nominal value of 0.01. 

As part of the broader sensitivity analysis, we also repeated the analysis 
described above for birds, replacing species range size with species mean body 
size (body size data were also extracted from Birdlife Datazone). This analysis was 
undertaken to determine whether the population density of birds, which is strongly 
and inversely correlated with body size, significantly affected results. It did not: 
the median estimate of the disturbance impact decreased by just 0.5%, and we do 
not report the full results here. 

Alternative undisturbed baselines. Estimating d, (mean species density in 
undisturbed landscapes) requires species distribution data from catchments 
with no within-forest or landscape disturbance. As we do not have a set of such 
catchments in either region, we took three approaches to calculating d.. The first 
two approaches rely on the least disturbed catchment in each region. In both 
Paragominas and Santarém, this reference catchment had minimal landscape dis- 
turbance (>99% primary forest). However, ground-based observations indicated 
that either selective logging or wildfire had affected at least 25% of the sampling 
plots within the reference catchments in both regions. We therefore calculated d, 
as the mean species density over all plots in the reference catchment and, to correct 
for within-forest disturbance, as the mean density over only undisturbed reference 
catchment plots. Finally, to account for potential biases in underlying (natural) 
species distributions, we also calculated d, using all undisturbed plots through- 
out each region (n= 30). This represents a more conservative estimate because it 
includes plots in catchments with less than 100% forest cover. 

Selecting representative estimates of CVD. Combining the three forest species 
selection methods, the three species’ weighting approaches, and the three esti- 
mates of d, returns 27 estimates of the CVD. For all approaches, we determined 
the average CVD with respect to primary forest cover by modelling the catch- 
ments’ summed presences with Poisson polynomial generalized linear models. 
We selected the best fitting model over all polynomials of degree up to cubics. 
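
The 27 estimates arise from crossing three choices with three options each, which can be enumerated as below. The string labels are shorthand invented for this sketch. The final filter reflects the Methods' exclusion of the baseline that includes disturbed reference catchment plots from the reported range. The Poisson model fitting itself is not shown.

```python
from itertools import product

species_filters = ["automatic", "high_basal_area", "convex_hull"]
species_weightings = ["equal", "linear", "squared"]
baselines = [
    "all_reference_plots",          # includes disturbed plots in the reference catchment
    "undisturbed_reference_plots",  # undisturbed plots in the reference catchment only
    "all_undisturbed_plots",        # undisturbed plots throughout the region
]

# Every combination of filter, weighting and baseline gives one CVD estimate.
combinations = list(product(species_filters, species_weightings, baselines))
print(len(combinations))  # 27

# Combinations left after dropping the baseline that includes disturbed plots.
reported = [c for c in combinations if c[2] != "all_reference_plots"]
print(len(reported))  # 18
```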

To express uncertainty over our estimates of the CVD, in the main text we 
present the median relationship between conservation value and forest cover along 
with the lower and upper bound range. We excluded from this range the estimate of 
d, that included disturbed reference catchment plots because it is not reflective of 
species density in the absence of disturbance. For the purposes of comparison, we 




have included these results in Extended Data Fig. 4. The median, lower and upper 
bound estimates of the CVD were given by, respectively: the convex hull filter, 
linear species weighting, and undisturbed reference catchment plots; the convex 
hull filter, no species weighting, and all undisturbed plots; and the high basal area 
filter, exponential species weighting, and undisturbed reference catchment plots. 
Adjusting for proportionality. Although the number of plots in catchments was 
proportional to forest cover, proportionality was not exact because the original 
distribution was based on the extent of both primary and secondary forests'®. 
We therefore corrected sampling effort by calculating the factor required to make 
sampling proportional to primary forest cover in each catchment and scaled our 
estimates of conservation value accordingly. For each catchment i, this factor is
given by p_i/t_i, where p_i is the proportion of catchment i that is primary forest and
t_i is the number of primary forest transects in catchment i.
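
As a minimal sketch of the correction factor described above, the ratio can be computed per catchment as follows. The function name and the two example catchments are hypothetical illustrations and are not taken from the study data; how the factor is then applied to the conservation value estimates is not specified here and is left out.

```python
def sampling_correction(primary_forest_proportion, n_primary_transects):
    """Return p_i / t_i for one catchment.

    primary_forest_proportion: proportion of the catchment that is primary forest (p_i).
    n_primary_transects: number of primary forest transects in the catchment (t_i).
    """
    if n_primary_transects <= 0:
        raise ValueError("a catchment needs at least one primary forest transect")
    return primary_forest_proportion / n_primary_transects


# Hypothetical catchments: (proportion primary forest, primary forest transects).
catchments = {"A": (0.80, 4), "B": (0.40, 4)}
factors = {name: sampling_correction(p, t) for name, (p, t) in catchments.items()}
```

With equal transect numbers, the catchment with twice the primary forest cover receives twice the factor.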

Extrapolating the CVD. To estimate disturbance impacts throughout Pará, we
divided the state into grid cells approximately equal in size to our study catchments.
We then used Brazil’s 2010 Terraclass product35 to determine the area of each cell
that was deforested, first removing non-forested areas that were covered by water
or tropical savannah. We then calculated each cell’s conservation value by applying
the median, lower and upper bound estimates of the CVD. The disturbance impact
in forest loss equivalent terms for cell i is given by p_i − (a_i − n_i)v_i, where p_i, a_i, n_i
and v_i are, respectively, the cell’s primary forest extent, area, non-forest area and
conservation value.
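
The per-cell calculation above can be sketched as below. This is an illustration only: the function name and the example cell are hypothetical, and the conservation value is assumed to be expressed on the same area scale as the other terms.

```python
def disturbance_impact(primary_forest, area, non_forest, conservation_value):
    """Disturbance impact for one grid cell, in forest loss equivalent terms.

    Implements p_i - (a_i - n_i) * v_i, where p_i is the primary forest extent,
    a_i the cell area, n_i the non-forest area and v_i the conservation value.
    """
    return primary_forest - (area - non_forest) * conservation_value


# Hypothetical cell: 50 km2 in total, 5 km2 of water or savannah,
# 30 km2 of remaining primary forest, conservation value 0.5.
impact = disturbance_impact(primary_forest=30.0, area=50.0,
                            non_forest=5.0, conservation_value=0.5)
```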

Linking landscape and within-forest disturbance with species distributions 
and traits. We investigated the importance of landscape and within-forest distur- 
bance at the plot level rather than the catchment level because many disturbance 
drivers act at local scales*!3. Variables representing landscape and within-forest 
disturbance were based on the analysis of georeferenced 30 m resolution Landsat 
TM (Thematic Mapper) and ETM+ images from 1988 to 2010 in Paragominas and
1990-2010 in Santarém. These were complemented by covariates that represent 
natural variation in soil conditions, elevation and slope. A full description of the 
data are available elsewhere!®. Variable abbreviations match those in Extended 
Data Figs 5-7. 

Within-forest disturbance. We measured the cumulative extent of canopy distur-
bance36 by calculating the percentage of the remaining primary forest in a 1 km
buffer around each plot that had never been classified as disturbed (undisturbed 
primary forest, UPF). We also included two measures of the frequency of distur- 
bance within plots: the number of times the plot was logged (NL) and the number 
of times the plot was burnt (NB) in visual inspections of satellite images or field 
observations. 

Landscape disturbance. We used two landscape configuration measures: the density 
of forest-agriculture edges (ED) and the percentage of primary and secondary 
(>10 years old) forest cover (FC) in 1 km buffers around plots. We used two meas-
ures of landscape history37: the deforestation curvature profile (DC) and the land-
use intensity profile (LI) in 500 m buffers around plots.

Natural environmental covariates. We used soil samples and digital elevation mod- 
els to derive covariates reflecting natural conditions. Soil variables were based on 
average values from five 30 cm deep soil profiles in each plot, and include acidity
(pH), clay content (Cl), and carbon stock (Ca). We applied a 100 m buffer around 
each plot in a digital elevation model to calculate mean plot elevation (El) and 
slope (SI). 

Linking landscape and within-forest disturbance with species distributions and 
traits. We used Random Forests (RF), a decision-tree classification methodology, 
to identify species that are well-modelled by our data and to rank the importance 
of individual variables in accounting for species distributions. RF was adapted for 
spatial autocorrelation within catchments using a modified ‘residual autocorrela-
tion’ approach38. The fit of the RF models and their predictive performance were
measured using area under receiver-operator curves (AUC)39. AUC evaluates the
ability of models to correctly predict higher probability of occurrence where spe- 
cies are present than where they are absent. An AUC value of 1 indicates perfect 
discrimination; a value of 0.5 suggests predictions no better than random. We 
performed multiple cross-validations to evaluate model predictive performance. 
For each species, data from each study catchment were used in turn as test data for 
models built with data from the other catchments. The cross-validated AUC value, 
AUCcv, was calculated as the average AUC value over all cross-validation tests for
each species. Species present on a minimum of three transects and with a summed
AUCcv > 0.6 over all variables were classified as well-modelled and included in the
analyses (31% of species). The importance of a variable was measured as its mean
AUCcv over all well-modelled species.
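
The cross-validated AUC described above can be illustrated with a short sketch. It assumes that a model has already produced predicted probabilities for each held-out catchment; the model fitting itself (Random Forests with the autocorrelation adjustment) is not reproduced, and the function names and example scores are hypothetical.

```python
def auc(scores_present, scores_absent):
    """AUC as the probability that a presence site scores higher than an absence site.

    Ties count as half, so identical scores everywhere give 0.5 (random).
    """
    wins = 0.0
    for sp in scores_present:
        for sa in scores_absent:
            if sp > sa:
                wins += 1.0
            elif sp == sa:
                wins += 0.5
    return wins / (len(scores_present) * len(scores_absent))


def cross_validated_auc(folds):
    """Average AUC over folds; each fold holds one catchment out as test data.

    folds: list of (scores at presence sites, scores at absence sites) pairs.
    """
    values = [auc(present, absent) for present, absent in folds]
    return sum(values) / len(values)


# Hypothetical predictions for one species in two held-out catchments.
folds = [([0.9, 0.8], [0.1, 0.2]), ([0.6, 0.3], [0.4, 0.5])]
auc_cv = cross_validated_auc(folds)
```

The pairwise definition is used here because it needs no external library; it is equivalent to the area under the receiver-operator curve.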

Models included the within-forest disturbance, landscape disturbance and nat- 
ural environment covariates described above. Given multicollinearity, we selected 
two variables from each group using three variable-selection methods: (i) we 
selected variables that we hypothesized to have the greatest influence on species’ 
presences (hypothesis-driven selection); (ii) we used principal component analysis 
(PCA) on the full set of variables in each group and selected the highest loaded 
variable on the first two principal axes (PCA selection); and (iii) we ran RF on the 
full set of variables and selected the two highest ranked in each group (step-wise 
selection). Results for each method are shown in Extended Data Figs 5-7. 

Next, we used RF to determine species’ partial responses along disturbance
gradients (Figs 3 and 4 and Extended Data Fig. 8). These partial responses give the
relative odds, exp(logit(p) − mean(logit(p))), where p is the probability of species’
presence and logit(p) = ln(p/(1 − p)), of detecting each species along a single variable
gradient, holding all other variables constant. For this analysis we selected the
landscape and within-forest disturbance variables that were most frequently ranked
highest in their group across the three variable selection methods.
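
The relative odds transformation can be sketched as follows, assuming a list of predicted presence probabilities at successive points along one gradient. The function names and example probabilities are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def relative_odds(probabilities):
    """Return exp(logit(p) - mean(logit(p))) for each point on the gradient.

    Values above 1 mark points where detection is more likely than the
    gradient-wide (geometric mean) odds, values below 1 the opposite.
    """
    logits = [logit(p) for p in probabilities]
    mean_logit = sum(logits) / len(logits)
    return [math.exp(value - mean_logit) for value in logits]


# Hypothetical presence probabilities at the two ends of a gradient.
odds = relative_odds([0.2, 0.8])
```

Because the mean is taken on the logit scale, the relative odds along a gradient always multiply to 1.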

We then used latent trajectory analysis (LTA), which groups species’ partial 
responses into homogenous classes, to characterize the main types of response to 
the selected variables. We built models with up to eight classes and selected that 
with the lowest Bayesian information criterion (BIC) score. LTAs were carried out
in the R package ‘lcmm’ (http://cran.r-project.org/web/packages/lcmm/lcmm.pdf). In
Figs 3 and 4, we show the LOWESS smoothed response of each species class along 
the associated disturbance gradient, with bandwidth set to 0.75. 

Finally, we investigated the relationship between the disturbance sensitivity of 
species classes, as determined by LTA, and species traits. To undertake this analysis, 
we defined a metric that represents the propensity of species classes to be detected 
along the variable gradients, which thus provides a measure of the sensitivity of 
the class to disturbance. The measure is: 


$$h_c = \int_{l}^{u} (m - x)\, d_c(x)\, \mathrm{d}x$$

where m, l and u are, respectively, the gradient’s mid-point and lower and upper
bounds, and d_c(x) is the relative odds of detecting species class c at point x on the
gradient, as determined by RF. We scaled h_c to lie between +1 and −1. Values of
h_c close to 1 indicate that species class c is much more likely to be detected at the
maximum than minimum extreme of the gradient; values close to −1 indicate that
species class c is much more likely to be found at the minimum than maximum
extreme. Values near 0 indicate that species class c is equally likely to be detected at
either extreme. We tested the relationship between h_c and species’ traits by fitting
polynomial models weighted by group size. In all cases, the response variable was
the average value of the species trait over all species in each class. We investigated
polynomial fits up to cubics and selected that with the lowest BIC score.
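
A rough numerical sketch of the unscaled integral of (m − x) multiplied by the relative odds over the gradient is given below. It uses the trapezoidal rule on sampled points. The function name and example values are hypothetical, and the scaling step that maps the result onto the range −1 to +1 (including its sign convention) is not specified in the text, so it is deliberately left out.

```python
def sensitivity_integral(xs, odds):
    """Trapezoidal estimate of the integral of (m - x) * d(x) from l to u.

    xs: increasing sample points on the gradient; xs[0] = l and xs[-1] = u.
    odds: relative odds of detection d(x) at each sample point.
    """
    lower, upper = xs[0], xs[-1]
    mid = (lower + upper) / 2.0
    weighted = [(mid - x) * d for x, d in zip(xs, odds)]
    total = 0.0
    for i in range(1, len(xs)):
        total += 0.5 * (weighted[i - 1] + weighted[i]) * (xs[i] - xs[i - 1])
    return total


# Hypothetical gradient from 0 to 100 sampled at three points.
flat = sensitivity_integral([0.0, 50.0, 100.0], [1.0, 1.0, 1.0])
skewed = sensitivity_integral([0.0, 50.0, 100.0], [2.0, 1.0, 0.0])
```

A class detected equally often at both extremes gives an integral of zero, and any imbalance between the extremes moves it away from zero.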


31. Lees, A. C. et al. One hundred and thirty-five years of avifaunal surveys around
Santarém, central Brazilian Amazon. Rev. Bras. Ornitol. 21, 16-57 (2013).

32. Lees, A. C. et al. Paragominas: a quantitative baseline inventory of an eastern
Amazonian avifauna. Rev. Bras. Ornitol. 20, 93-118 (2012).

33. Barrio, J. Hunting pressure on cracids (Cracidae: Aves) in forest concessions in
Peru. Rev. Peru. Biol. 18, 225-230 (2011).

34. Zanne, A. E. et al. Data from: Towards a worldwide wood economics spectrum.
Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.234 (2009).

35. Instituto Nacional de Pesquisas Espaciais (INPE). Terraclass data 2010;
available at http://www.inpe.br/cra/projetos_pesquisas/terraclass2010 (2013).

36. Souza, C. M. Jr. et al. Ten-year Landsat classification of deforestation and forest
degradation in the Brazilian Amazon. Remote Sens. 5, 5493-5513 (2013).

37. Ferraz, S. F. D., Vettorazzi, C. A. & Theobald, D. M. Using indicators of
deforestation and land-use dynamics to support conservation strategies:
a case study of central Rondônia, Brazil. For. Ecol. Manage. 257, 1586-1595
(2009).

38. Crase, B., Liedloff, A. C. & Wintle, B. A. A new method for dealing with residual
spatial autocorrelation in species distribution models. Ecography 35, 879-888
(2012).

39. Pearce, J. & Ferrier, S. Evaluating the predictive performance of habitat models
developed using logistic regression. Ecol. Model. 133, 225-245 (2000).


[Extended Data Figure 1 image. Map legend: non-forest area, primary forest, secondary forest. Plot layout legend: bird and beetle sampling points, small stem sampling, large stem sampling.]

Extended Data Figure 1 | Study design. a, The location of Paragominas and Santarém within Pará. b, c, The distribution of study catchments (n = 36)
within Paragominas and Santarém, respectively. d, The distribution of study plots (n = 175) in example catchments spanning the gradient of primary
forest. Selected catchments are shown in red in a and b. e, Sampling design within each plot.


[Extended Data Figure 2 image. Column titles: large stems, small stems, beetles. x-axis categories: PA, AG, SF.]

Extended Data Figure 2 | Richness of forest species. a–c, The richness
of forest species in secondary forests (SF), pastures (PA), and mechanised
agricultural lands (AG) relative to the average richness of forest species
in all undisturbed and disturbed primary forests (dashed line) in
Paragominas (green) and Santarém (orange). Panels show the convex
hull (a), automatic (b) and high basal area filters (c) used to classify forest
species (see Methods).


[Extended Data Figure 3 image. Panel titles: large stems (a), small stems (b). x-axis: primary forest cover; y-axis: conservation value.]

Extended Data Figure 3 | Conservation value of primary forests
measured by individual taxa. a–d, Estimates of conservation value in
the Paragominas (circles) and Santarém (triangles) study regions from
large-stemmed plants (a), small-stemmed plants (b), birds (c) and
dung beetles (d). Dashed lines show expectations without disturbance.
Grey lines show all regressions, with the black solid line showing the
median response (see Methods). Values were standardized across study
regions and taxa. There was no significant difference between taxa in the
median estimate (F3,117 = 1.36, P = 0.26, ANCOVA).


[Extended Data Figure 4 image. x-axis: primary forest cover; y-axis: conservation value.]


Extended Data Figure 4 | Range of conservation value estimates using three alternative sets of reference plots. Mean species density (d,) is measured 
by: all disturbed and undisturbed plots in the least disturbed reference catchments (grey shaded region), all undisturbed plots throughout a region (green 
shaded region), and undisturbed plots in the reference catchments (purple shaded region). See Methods for details. 


[Extended Data Figure 5 image. x-axis: AUCcv.]

Extended Data Figure 5 | The importance of hypothesis selected variables.
a–h, Species’ AUCcv values for each variable in Paragominas (a, c, e, g) and
Santarém (b, d, f, h) for large-stemmed plants (a, b), small-stemmed plants
(c, d), birds (e, f) and beetles (g, h). Variable importance was measured by
the mean AUCcv over all well-modelled species (see Methods). Variable
colours denote group membership: green, orange and blue represent
landscape disturbance, within-forest disturbance and natural variables,
respectively (see Methods for variable descriptions). Letters show the
results for multiple pair-wise comparisons of group means using Tukey’s
range test. Variables which do not share a letter have significantly different
mean importance (P < 0.05).


[Extended Data Figure 6 image. x-axis: AUCcv.]

Extended Data Figure 6 | The importance of PCA selected variables.
a–h, Species’ AUCcv values for each variable in Paragominas (a, c, e, g)
and Santarém (b, d, f, h) for large-stemmed plants (a, b), small-stemmed
plants (c, d), birds (e, f) and beetles (g, h). Variable importance was
measured by the mean AUCcv over all well-modelled species (see Methods).
Variable colours denote group membership: green, orange and blue
represent landscape disturbance, within-forest disturbance and natural
variables, respectively (see Methods for variable descriptions). Letters
show the results for multiple pair-wise comparisons of group means using
Tukey’s range test. Variables which do not share a letter have significantly
different mean importance (P < 0.05).


[Extended Data Figure 7 image. x-axis: AUCcv.]

Extended Data Figure 7 | The importance of step-wise selected variables.
a–h, Species’ AUCcv values for each variable in Paragominas (a, c, e, g)
and Santarém (b, d, f, h) for large-stemmed plants (a, b), small-stemmed
plants (c, d), birds (e, f) and beetles (g, h). Variable importance is
measured by the mean AUCcv over all well-modelled species (see
Methods). Variable colours denote group membership: green, orange
and blue represent landscape disturbance, within-forest disturbance and
natural variables, respectively (see Methods for variable descriptions).
Letters show the results for multiple pair-wise comparisons of group
means using Tukey’s range test. Variables which do not share a letter have
significantly different mean importance (P < 0.05).


[Extended Data Figure 8 image. Panel groups: landscape disturbance (x-axis: forest cover) and within-forest disturbance (x-axis: undegraded forest), each shown for Paragominas and Santarém.]

Extended Data Figure 8 | Responses of small-stemmed plants and
dung beetles to disturbance. a–h, The odds of detecting small-stemmed
plant (a–d) and dung beetle (e–h) species groups along gradients of
landscape disturbance (a, b, e, f) and within-forest disturbance (c, d, g, h)
in Paragominas (a, c, e, g) and Santarém (b, d, f, h) (see Methods). Species
groups, shown by different coloured lines, are composed of species with
similar disturbance responses (see Methods). Line thickness represents the
relative size of the groups.


Extended Data Table 1 | Policy interventions used to reduce deforestation and their effect on disturbance

Policy intervention: Protected areas (IUCN classes I-IV)
  Direct effects on reducing landscape disturbance:
    Positive if there is no leakage of deforestation
  Direct effects on reducing within-forest disturbance:
    Positive if park management is effective and leakage of logging is avoided

Policy intervention: Sustainable-use reserves (IUCN class VI)
  Direct effects on reducing landscape disturbance:
    Positive if there is no leakage of deforestation
  Direct effects on reducing within-forest disturbance:
    Positive where more sustainable approaches replace conventional approaches, and if leakage of logging is avoided
    Negative if forest-use is incentivised in areas that would not otherwise be disturbed

Policy intervention: Legal stipulation to maintain forest cover on private lands
  Direct effects on reducing landscape disturbance:
    Positive, but there is no stipulation to consider landscape configuration
  Direct effects on reducing within-forest disturbance:
    No likely impact without additional measures

Policy intervention: Agricultural intensification on deforested lands
  Direct effects on reducing landscape disturbance:
    Positive if this prevents further forest loss
    Negative if increased profits encourage further land-use change
    Negative if the matrix becomes more hostile to forest species, increasing isolation
  Direct effects on reducing within-forest disturbance:
    Positive if reduced fire use in agriculture prevents wildfires
    No likely impact on selective logging or hunting
    Negative if there are new spillover effects from agriculture, such as deposition of nutrients and pesticides

Policy intervention: Industrial and community based reduced impact logging
  Direct effects on reducing landscape disturbance:
    Positive if economic returns protect forests from clearance
    Negative when new roads and logging patios increase edge-effects and isolation
  Direct effects on reducing within-forest disturbance:
    Positive if more sustainable approaches replace conventional logging
    Negative when logging is incentivised in undisturbed forests

Policy intervention: Protecting forests through moratoria & certification
  Direct effects on reducing landscape disturbance:
    Positive if this prevents further forest loss and there is no leakage of deforestation
  Direct effects on reducing within-forest disturbance:
    No likely impact without additional measures


Extended Data Table 2 | Forest loss and disturbance in Pará and its areas of endemism

Region          Area        Forest area   Forest loss   Disturbance                  Relative (%)

All land (a)
Pará state      1,259,916   1,141,659     245,288       171,695 (130,827-187,328)    70 (53-76)
Belém AoE         138,351     131,637      80,861        15,431 (10,904-18,743)      19 (13-23)
Guiana AoE        273,692     246,802      12,290        37,044 (29,134-37,124)      301 (237-302)
Rondônia AoE       66,222      56,876       5,728         9,366 (7,227-10,103)       163 (126-177)
Tapajós AoE       418,201     388,738      37,783        65,221 (50,117-70,951)      173 (133-188)
Xingu AoE         321,304     294,543     109,687        40,502 (30,173-45,918)      37 (28-42)

Private lands + sustainable use reserves (b)
Pará state        918,694     820,636     242,578       122,881 (92,144-139,033)     51 (38-57)
Belém AoE         136,405     129,691      80,432        15,099 (10,652-18,358)      19 (13-23)
Guiana AoE        157,288     137,468      12,084        20,879 (16,278-21,497)      173 (135-178)
Rondônia AoE       55,109      45,832       5,559         7,631 (5,859-8,379)        137 (105-151)
Tapajós AoE       271,761     250,309      36,466        43,980 (33,303-49,795)      121 (91-137)
Xingu AoE         253,884     232,681     109,029        30,891 (22,572-36,260)      28 (21-33)

Private lands (c)
Pará state        530,931     490,200     230,293        68,694 (49,731-82,024)      30 (22-36)
Belém AoE         129,570     125,209      79,072        14,324 (10,077-17,435)      18 (13-22)
Guiana AoE         55,068      42,069      11,615         6,567 (4,912-7,491)        57 (42-64)
Rondônia AoE       33,676      28,033       4,551         4,692 (3,574-5,248)        103 (79-115)
Tapajós AoE       125,089     118,409      31,401        21,982 (16,126-26,436)      70 (51-84)
Xingu AoE         196,824     182,752     106,276        22,159 (15,783-26,675)      21 (15-25)

a–c, The loss of primary forest conservation value from forest loss and forest disturbance in forest loss-equivalent terms across ~50 km2 cells covering all land in Pará (a), private lands and
sustainable use reserves only (b), and private lands only (c). Disturbance losses are calculated using the median estimate of conservation value with the lower and upper bound range in parentheses
(see Methods). Area is the total area of the region in km2. Forest area gives the area of the region that was or is primary forest cover in km2. Forest loss gives the total loss of primary forest in km2.
Disturbance gives the loss of conservation value due to disturbance in km2. Relative gives the disturbance-mediated loss of conservation value relative to that from forest loss.
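
As a quick check on how the Relative (%) column relates to the others, the sketch below recomputes it from the Disturbance and Forest loss values reported in the table for all land. The function name is a hypothetical illustration; only the median estimates are used.

```python
def relative_loss_percent(disturbance_km2, forest_loss_km2):
    """Disturbance-mediated loss as a percentage of the loss from deforestation."""
    return 100.0 * disturbance_km2 / forest_loss_km2


# Median disturbance and forest loss values for all land (panel a of the table).
para_state = relative_loss_percent(171695, 245288)
guiana_aoe = relative_loss_percent(37044, 12290)
```

Rounded to whole percentages, these reproduce the tabulated values of 70 for the state as a whole and 301 for the Guiana area of endemism.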




LETTER 


doi:10.1038/nature18621 


Allosteric inhibition of SHP2 phosphatase inhibits 
cancers driven by receptor tyrosine kinases 


Ying-Nan P. Chen1, Matthew J. LaMarche1, Ho Man Chan1, Peter Fekkes1, Jorge Garcia-Fortanet1, Michael G. Acker1,
Brandon Antonakos1, Christine Hiu-Tung Chen1, Zhouliang Chen1, Vesselina G. Cooke1, Jason R. Dobson1, Zhan Deng1,
Feng Fei1, Brant Firestone1, Michelle Fodor1, Cary Fridrich1, Hui Gao1, Denise Grunenfelder1, Huai-Xiang Hao1, Jaison Jacob1,
Samuel Ho1, Kathy Hsiao1, Zhao B. Kang1, Rajesh Karki1, Mitsunori Kato1, Jay Larrow1, Laura R. La Bonte1, Francois Lenoir1,
Gang Liu1, Shumei Liu1, Dyuti Majumdar1, Matthew J. Meyer1, Mark Palermo1, Lawrence Perez1, Minying Pu1, Edmund Price1,
Christopher Quinn1, Subarna Shakya1, Michael D. Shultz1, Joanna Slisz1, Kavitha Venkatesan1, Ping Wang1, Markus Warmuth1,
Sarah Williams1, Guizhi Yang1, Jing Yuan1, Ji-Hu Zhang1, Ping Zhu1, Timothy Ramsey1, Nicholas J. Keen1, William R. Sellers1,
Travis Stams1 & Pascal D. Fortin1


The non-receptor protein tyrosine phosphatase SHP2, encoded by
PTPN11, has an important role in signal transduction downstream
of growth factor receptor signalling and was the first reported
oncogenic tyrosine phosphatase1. Activating mutations of SHP2
have been associated with developmental pathologies such as
Noonan syndrome and are found in multiple cancer types, including
leukaemia, lung and breast cancer and neuroblastoma1–5. SHP2 is
ubiquitously expressed and regulates cell survival and proliferation
primarily through activation of the RAS-ERK signalling
pathway2,3. It is also a key mediator of the programmed cell death
1 (PD-1) and B- and T-lymphocyte attenuator (BTLA) immune
checkpoint pathways6,7. Reduction of SHP2 activity suppresses
tumour cell growth and is a potential target of cancer therapy8,9.
Here we report the discovery of a highly potent (IC50 = 0.071 μM),
selective and orally bioavailable small-molecule SHP2 inhibitor,
SHP099, that stabilizes SHP2 in an auto-inhibited conformation.
SHP099 concurrently binds to the interface of the N-terminal SH2,
C-terminal SH2, and protein tyrosine phosphatase domains, thus
inhibiting SHP2 activity through an allosteric mechanism. SHP099
suppresses RAS-ERK signalling to inhibit the proliferation of
receptor-tyrosine-kinase-driven human cancer cells in vitro and is
efficacious in mouse tumour xenograft models. Together, these data
demonstrate that pharmacological inhibition of SHP2 is a valid
therapeutic approach for the treatment of cancers.

To discover new cancer therapeutic targets, a deep-coverage, pooled 
short hairpin RNA (shRNA) library targeting 7,500 genes with 20 
shRNAs per gene was screened across a panel of 250 cell lines from 
the Cancer Cell Line Encyclopedia (CCLE) (ref. 10 and Schlabach 
et al., manuscript in preparation). An unbiased correlation analysis 
was performed and revealed that cell lines sensitive to SHP2 depletion 
are most sensitive to EGFR depletion (P=4.10 x 10°). When a subset 
of cell lines dependent on known receptor tyrosine kinases (RTKs) 
(such as EGFR, ERBB2, c-MET, and FLT3) and FRS2-dependent lines 
were considered as a class, a marked correlation emerged with sen-
sitivity to SHP2 depletion (Fig. 1a and Extended Data Fig. 1, Fisher’s
exact P < 4.45 × 10−14). These findings provide a robust cross-
validation of reports that RTK-driven cancer cells depend on SHP2 for
survival8,9. Conversely, cell lines that were sensitive to KRAS, NRAS
or BRAF depletion were refractory to SHP2 downregulation (Fig. 1b 
and Extended Data Fig. 1, Fisher’s exact P< 7.90 x 10°). To validate 
these findings further, doxycycline (dox)-inducible SHP2 shRNAs 
were introduced into cancer cell lines with distinct RTK alterations,
including amplification of EGFR (MDA-MB-468, KYSE520), ERBB2 
(NCI-H2170), FGFR2 (SUM52 and KATO III), and EML4-ALK 


translocation (NCI-H2228). Consistent with the shRNA screening 
data, SHP2 depletion led to marked inhibition of colony formation 
in each of these RTK-dependent cancer cells (Extended Data Fig. 1a). 
Importantly, this was specific to RTK-dependent cell lines, as BRAF- 
and KRAS-mutated cells (A2058 and MDA-MB-231) showed no 
growth effect upon SHP2 depletion (Extended Data Fig. 1b). To 
evaluate the importance of SHP2 catalytic activity for the growth of 
sensitive cell lines, a complementation experiment was conducted by 
re-expressing shRNA-resistant alleles of wild-type SHP2 or the catalyt-
ically inactive SHP2C459S variant in MDA-MB-468 cells. SHP2 deple-
tion inhibited the growth of MDA-MB-468 accompanied by reduced
p-ERK levels (Fig. 1c, Extended Data Fig. 1a, c). Upon dox treatment,
wild-type SHP2 and SHP2C459S were expressed at similar levels.
Expression of wild-type SHP2, but not SHP2C459S, restored p-ERK
levels and cell growth (Fig. 1c, Extended Data Fig. 1c). Similar results 
were obtained with SUM52 cells (Extended Data Fig. 2). Therefore, 
SHP2 phosphatase activity is required for p-ERK activation and main- 
tenance of cell growth in RTK-driven cancers. 

On the basis of the shRNA screening results, we hypothesized that 
cells with constitutively activated RAS signalling would be insensitive 
to SHP2 inhibition. To test this hypothesis, SHP2-dependent SUM52 
cells were transduced with a lentivirus carrying the KRASG12V onco-
gene. Expression of KRASG12V restored p-ERK levels and rendered
these cells insensitive to SHP2 knockdown (Fig. 1d, Extended Data
Fig. 1d). Furthermore, SHP2 depletion had no impact on cell growth
and proliferation in MDA-MB-231 (KRASG13D) or A2058 (BRAFV600E)
cells (Extended Data Fig. 1b). These data strongly suggest that can- 
cer cells carrying oncogenic RAS/RAF mutations will be refractory to 
SHP2 inhibition. 

Efforts to discover small molecule therapeutics targeting protein
tyrosine phosphatases (PTPs) have been challenged by the highly
solvated and polar nature of the catalytic site, as exemplified by the
SHP2 PTP domain11–15. To discover novel modes of phosphatase
inhibition, we developed screening strategies aimed at identifying
SHP2 allosteric inhibitors. SHP2 is activated by peptides and proteins
containing appropriately spaced phospho-tyrosine residues that bind
the N-terminal and C-terminal SH2 domains (denoted as N-SH2
and C-SH2, respectively) in a bidentate manner, releasing the auto-
inhibitory interface and making the active site available for substrate
recognition and turnover16,17. To discover inhibitors that could take
advantage of this natural regulatory mechanism and lock SHP2 in
an auto-inhibited conformation (Fig. 2a), a diverse library of 100,000
compounds was screened at 20 μM against SHP2 (residues 1-525)
that was partially activated using 0.5 μM of a bisphosphorylated IRS-1


1Novartis Institutes for Biomedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA. 


[Figure 1 image. Legends: RTK dependent (n = 87), RTK independent (n = 283); RAS/RAF dependent (n = 114), RAS/RAF independent (n = 253). x-axis in a and b: cell line.]

Figure 1 | Genetic validation of SHP2. a, Waterfall plot showing the ATARIS Quantile
score for SHP2 shRNAs coloured by the effect of
RTK shRNA knockdown (ATARIS score < −1.0)
in 370 cell lines. b, Waterfall plot showing the
ATARIS Quantile score for SHP2 shRNAs
coloured by the effect of shRNA knockdown
of KRAS, NRAS or BRAF (ATARIS score
< −1.0) in 367 cell lines. c, Cell proliferation in
MDA-MB-468 SHP2 knockdown cells stably
expressing GFP, wild-type haemagglutinin
(HA)-tagged SHP2 or HA–SHP2C459S. Cells were
treated with dox (100 ng ml−1) and cell growth
was measured by CellTiter-Glo assay at the
indicated times. Data presented as mean ± s.d.
(n = 3). d, Colony formation of SHP2-depleted
SUM52 cells stably expressing vector control
or HA–KRASG12V. In c and d, dox treatment
induces depletion of endogenous SHP2 protein
and simultaneous expression of the exogenous
proteins GFP, wild-type HA–SHP2,
HA–SHP2C459S or HA–KRASG12V.

[Figure 2 image. x-axes: 2P-IRS-1 peptide concentration (μM); inhibitor concentration (μM). Legend in b: screening hits, validated allosteric inhibitor, inactive. Labelled potencies: SHP836, SHP2 IC50 5.0 μM; SHP099, SHP2 IC50 0.071 μM.]

Figure 2 | Discovery of a SHP2 allosteric inhibitor. a, Schematic of SHP2
allosteric activation by 2P-IRS-1, highlighting an allosteric inhibitor that
blocks the activation of SHP2 via the enrichment of its auto-inhibited
conformer and dose-dependent activation of SHP2 by 2P-IRS-1 peptide.
SHP2, SHP2PTP and a dimethyl sulfoxide (DMSO) control were incubated
with increasing concentrations of 2P-IRS-1 peptide. Biochemical activity
was monitored using the DiFMUP (6,8-difluoro-4-methylumbelliferyl
phosphate) substrate and normalized against the basal activity determined
for each condition in the absence of 2P-IRS-1. SHP2i, SHP2 inhibitor.
b, Primary screen was performed using a 100,000-molecule library.
SHP2 was screened in the presence of 0.5 μM 2P-IRS-1 and 20 μM of each
compound. The red dotted line represents the 30% inhibition threshold.
The red circles represent the six validated allosteric inhibitors. SHP836 is
labelled, and inhibited SHP2 activity by 87.6%. The Z′ factor determined
for the screen was >0.9. c, Biochemical assay fingerprint observed with
allosteric inhibitor SHP836. Inhibition of SHP2PTP and SHP2 in the
presence of 0.5 μM 2P-IRS-1 and 5 μM 2P-IRS-1. d, SHP2 inhibition by
SHP099 and PTP active-site inhibitor SHP043 in the presence of various
2P-IRS-1 concentrations. e, Chemical structure of SHP099 and X-ray
crystal structure of SHP2 in complex with SHP099 (PDB accession
number 5EHR). Surface representation of SHP2 in complex with SHP099
bound in the central tunnel formed at the interface of the three domains
(green, N-SH2; blue, C-SH2; tan, PTP domain). SHP2 is in the inactive
conformation with the N-terminal SH2 domain fully occluding the entry
of substrate to the active-site cysteine (shown in red). f, Key interactions
between SHP099 and all three domains of SHP2 highlighted, including
hydrogen bonds with Arg111 (N-SH2), Phe113 (C-SH2), and Glu250
(PTP). Data points along the line in a, c and d represent the mean of two
replicate values.


peptide (2P-IRS-1). Nine hundred compounds were found to inhibit 
the enzyme by 30% or more (Fig. 2b) and further profiled in three 
distinct biochemical assays: (1) using a truncated form of SHP2 with 
the PTP domain only (SHP2^PTP), or SHP2 assayed with (2) partially
and (3) fully activating levels of 2P-IRS-1. Compounds that inhibited
only the phosphatase domain were deprioritized to enrich for potential
allosteric inhibitors. Six compounds, exemplified by SHP836, demonstrated
no inhibition against SHP2^PTP, moderate inhibitory activity
(IC50 = 5-50 µM) against SHP2 activated by 0.5 µM 2P-IRS-1, and
reduced inhibitory activity in the presence of a higher concentration of
2P-IRS-1 (Fig. 2c). SHP836 was further optimized to SHP099, yielding
a >70-fold improvement in biochemical potency to IC50 = 0.071 µM
(manuscript in preparation). Furthermore, SHP099 showed no
detectable activity against a panel of 21 phosphatases and 66 kinases¹⁸
(Extended Data Tables 1 and 2), and only had modest activity against
5HT3 when profiled against a preclinical safety pharmacology panel
representing 49 common adverse drug reaction targets¹⁹ (Extended
Data Table 3). Importantly, SHP099 showed no activity against SHP1, 
the closest homologue of SHP2 sharing 61% amino acid sequence 
identity, supporting its high degree of target selectivity. 
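
The triage logic described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the screening pipeline used in the study: the `HitProfile` record, the function names and the two example compounds are hypothetical, and only the 5-50 µM window and the SHP836 and SHP099 IC50 values (5.0 µM and 0.071 µM) are taken from the text and Fig. 2c.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HitProfile:
    """IC50 values (in uM) for one compound; None means no inhibition was detected."""
    name: str
    ic50_ptp_only: Optional[float]      # truncated PTP-domain construct
    ic50_low_peptide: Optional[float]   # SHP2 with 0.5 uM 2P-IRS-1
    ic50_high_peptide: Optional[float]  # SHP2 with a higher 2P-IRS-1 concentration


def looks_allosteric(hit: HitProfile, lo: float = 5.0, hi: float = 50.0) -> bool:
    """Apply the three criteria from the text (thresholds lo/hi are the 5-50 uM window)."""
    if hit.ic50_ptp_only is not None:
        return False  # inhibits the isolated phosphatase domain: deprioritized
    if hit.ic50_low_peptide is None or not (lo <= hit.ic50_low_peptide <= hi):
        return False  # needs moderate activity against partially activated SHP2
    # Reduced activity at higher peptide: a larger IC50 or no measurable inhibition.
    return hit.ic50_high_peptide is None or hit.ic50_high_peptide > hit.ic50_low_peptide


def fold_improvement(ic50_before: float, ic50_after: float) -> float:
    """Ratio of IC50 values; values above 1 mean the second compound is more potent."""
    return ic50_before / ic50_after


# Hypothetical example compounds, for illustration only.
candidate = HitProfile("hypothetical-hit", None, 5.0, 40.0)
active_site_like = HitProfile("hypothetical-control", 2.0, 3.0, 1.0)
print(looks_allosteric(candidate), looks_allosteric(active_site_like))

# IC50 values reported for SHP836 and SHP099.
print(round(fold_improvement(5.0, 0.071), 1))
```

The last line reproduces the ">70-fold" figure quoted above from the two reported IC50 values.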

To understand the mechanism of inhibition, we determined 
the effect of the 2P-IRS-1 peptide on the potency of SHP099 and 
compared it to the active site inhibitor SHP043, stemming from a 
previously reported class of PTP1B inhibitors²⁰ (Fig. 2d). An increase
in 2P-IRS-1 peptide correlated with a tenfold decrease in SHP099 
potency, as opposed to a sixfold increase in SHP043 potency across the 
same range of 2P-IRS-1 peptide concentrations (Fig. 2d). These data 
suggest that SHP099 is interfering with the 2P-IRS-1-driven activation 
of SHP2, and that the active site of the PTP is more readily accessible 
for SHP043 binding in the activated conformation of SHP2. In addi- 
tion, isothermal titration calorimetry revealed that SHP099 bound to 
SHP2 with 1:1 stoichiometry with a measured dissociation constant 
of 0.073 µM (K = 1.38 × 10⁷ M⁻¹, Extended Data Fig. 3a).
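
The dissociation constant quoted here follows from the association constant reported by calorimetry, since Kd = 1/K for 1:1 binding. A minimal sketch of the conversion is given below; the function name is hypothetical and the only input taken from the text is K = 1.38 × 10⁷ M⁻¹.

```python
def kd_micromolar(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant in uM."""
    kd_molar = 1.0 / ka_per_molar
    return kd_molar * 1e6


kd = kd_micromolar(1.38e7)
print(f"Kd = {kd:.4f} uM")
```

The result is about 0.072 µM, in line with the reported 0.073 µM; the small difference presumably reflects rounding of K.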

To distinguish further the mechanism of inhibition from catalytic 
site inhibitors, we solved the crystal structure of SHP099 in complex 
with SHP2 (resolution, 1.7 Å) (Extended Data Table 4). Here, SHP2
was found in the same auto-inhibited, inactive conformation as the 
reported apo-SHP2 structure¹⁷, with the N-terminal SH2 domain
blocking the active site. Our structure revealed SHP099 bound to 
the central tunnel formed at the interface of the N-SH2, C-SH2, and 
PTP domains (Fig. 2e) and interactions between SHP099 and all three 
domains of SHP2, strongly suggesting that SHP099 inhibits the cata- 
lytic activity through stabilization of the inactive conformation of the 
enzyme. Key interactions include hydrogen bonds with Arg111 and 
Phe113 located on the linker between the N-SH2 and C-SH2 domains, 
as well as Glu250 from the PTP domain. Additionally, the dichloro- 
phenyl group of SHP099 makes extensive hydrophobic interactions 
with the sidechains of Leu254, Gln257, Pro491, and Gln495 of the 
PTP domain (Fig. 2f). In the homologue SHP1, repositioning of the 
linker between the two SH2 domains would remove key SHP099 inter- 
actions (highlighted by residue Arg109 in SHP1 and Arg111 in SHP2; 
Extended Data Fig. 3b-d), and would yield a significantly larger central
tunnel with an estimated volume of 1,012 Å³ compared to 464 Å³
for SHP2 and to the 262 Å³ volume of SHP099. These observations
probably explain the selectivity of SHP099 for SHP2 over SHP1. 

To determine whether SHP099 was capable of cellular SHP2 inhi- 
bition, cells were treated with increasing concentrations of SHP099. 
SHP099 inhibited p-ERK with an IC50 of ~0.25 µM in SHP2-
dependent MDA-MB-468 and KYSE520 cells, but not in A2058 cells
(Fig. 3a). No effect was observed on p-AKT levels across the same 
cells (Extended Data Fig. 4a). The inhibition of p-ERK was consist- 
ent with the growth inhibition observed in a colony-formation assay 
(Extended Data Fig. 4b). The inhibition of KYSE520, MDA-MB-468 
and A2058 cells by SHP099 was also assessed in a cell proliferation 
assay and extended to three additional SHP2-dependent haemato- 
poietic cell lines, MV-411, MOLM-13 and Kasumi-1, resulting in the 


LETTER 


[Figure 3 plot residue. a, p-ERK response versus concentration (µM) for KYSE520, A2058 and MDA-MB-468 cells. b, x axis, absolute IC50 (µM); legend, RTK-altered (* = JAK1/2 alteration); labelled lines include MV-411, MONO-MAC-1, Kasumi-6, EOL-1, GDM-1 and MOLM-13. d, Normalized SHP2 activity (%) versus SHP099 (µM) for SHP2 WT and SHP2^T253M/Q257L. e, KYSE520 cells expressing SBP-SHP2 WT or SBP-SHP2^T253M/Q257L; blots for SHP2, p-ERK and ERK. f, x axis, SHP099 (µM); series SHP2 WT and SHP2^T253M/Q257L.]


Figure 3 | Validation of SHP2-dependent inhibitory activity of SHP099
in cells. a, Inhibition of p-ERK activity by SHP099 in A2058, KYSE520
or MDA-MB-468 cells assayed by SureFire p-ERK assay. p-ERK activity
is expressed as a percentage of the DMSO control. Data presented as
mean ± s.d. (n = 3). b, Activity of SHP099 in 71 haematopoietic cell lines.
The data are plotted as normalized inhibition at 30 µM SHP099 (y axis)
against calculated absolute IC50 values of SHP099 for each cell line
(x axis). Solid red circles represent cancer cells with RTK, JAK1 or JAK2
mutations; black circles correspond to NRAS- or KRAS-mutated cells.
The corresponding data and cell line genotypes are in Supplementary
Information Table 1. c, Model of engineered SHP2^T253M/Q257L mutant
highlighting steric clashes between the mutated residues and SHP099.
d, Biochemical inhibition of wild-type SHP2 and SHP2^T253M/Q257L by
SHP099. e, Western blot of SHP2, p-ERK and ERK from SHP2-depleted
KYSE520 cells stably re-expressing SBP-tagged wild-type SHP2 or
SBP-SHP2^T253M/Q257L and treated with SHP099 (1, 3, 10 µM).
f, Proliferation of SHP2-depleted KYSE520 cells stably re-expressing
wild-type streptavidin-binding peptide tagged (SBP) SHP2 or
SBP-SHP2^T253M/Q257L treated with SHP099 (1.25, 2.5, 5, 10 µM). Data points
along the line in d and f represent the mean of two replicate values.


expected cell growth inhibition (Extended Data Fig. 4c). We further 
profiled SHP099 in a panel of 71 haematopoietic cancer cell lines and 
26 colorectal cancer cell lines. Haematopoietic cancer cells with known 
alterations in oncogenic RTKs or other tyrosine kinases such as JAK1 
or JAK2 were sensitive to SHP099 inhibition (Fig. 3b, Supplementary 
Information Table 1). Similarly, colorectal cancer cells that were 
sensitive to the potent Her1/2 and EGFR inhibitor Lapatinib, and
hence dependent on EGFR signalling, also responded to SHP099 treat- 
ment (Extended Data Fig. 4d, Supplementary Information Table 2). 
In contrast, RAS- or BRAF-mutated cells from both lineages were generally
resistant to SHP099 treatment (IC50 > 30 µM for haematopoietic
lines and >20 µM for colorectal lines) (Fig. 3b, Extended Data Fig. 4d
and Supplementary Table 2). The observed correlation between RTK 
dependence and SHP099 sensitivity is robustly supported by a Chi- 
squared test of independence (P< 1.1 x 10~'°). These data therefore 
recapitulate the differential growth inhibitory effects observed in the 
shRNA screen and suggest a strong association between RTK depend- 
ence and sensitivity to SHP2 inhibition. 

To determine whether SHP099-mediated suppression of MAPK 
signalling and growth inhibition was an on-target consequence of 




[Figure 4 plot residue. a, x axis, time after implantation (d); series Control shRNA − dox, Control shRNA + dox, PTPN11 shRNA − dox and PTPN11 shRNA + dox. c, x axis, time after implantation (d); series Vehicle, SHP099 and Erlotinib.]


Figure 4 | In vitro and in vivo characterization of SHP099. a, SHP2 
knockdown inhibits the growth of established KYSE520 xenograft 
tumours in vivo. KYSE520 cells stably expressing dox-inducible 
non-targeting control or PTPN11 shRNA were inoculated into mice. 
Mice were treated with vehicle (dotted lines) or dox-supplemented diet 
(solid lines) starting 10 days after implantation. The tumour volume of 
vehicle or dox-treated mice is plotted as the mean ± s.e.m. (n = 9). b,
In vivo plasma SHP099 concentration and xenograft p-ERK levels 
following a single oral administration of SHP099 (100 mg per kg) or 
erlotinib (80 mg per kg) to nude mice with subcutaneous KYSE520 
xenografts. SHP099 plasma concentrations and xenograft p-ERK level 
were assessed through the first 24h following compound administration. 


pharmacological inhibition of SHP2, inhibitor-resistant alleles were 
developed. From the co-crystal structure, we hypothesized that muta- 
tion of Thr253 and Gln257 would disrupt SHP099 binding, but main- 
tain the integrity of the three-domain regulatory interface (Fig. 3c). 
After testing single- and double-mutant alleles, SHP2^T253M/Q257L was
found to retain the catalytic activity and auto-inhibited basal state 
of SHP2, but was 1,000-fold less sensitive to SHP099 inhibition as 
compared to wild-type SHP2 in vitro (Fig. 3d, Extended Data Fig. 3d). 
Treatment of KYSE520 cells expressing SHP2^T253M/Q257L with SHP099
inhibited neither signalling to p-ERK nor cellular growth, in contrast
to the KYSE520 wild-type SHP2 control (Fig. 3e, f). These
data strongly suggest that SHP099 inhibits MAPK signalling and 
proliferation in RTK-dependent cells through direct on-target 
inhibition of SHP2. 

We next established a subcutaneous xenograft model using 
KYSE520 stably transduced with dox-inducible PTPN11 shRNA to 
investigate whether SHP2 was required for tumour maintenance 
in vivo. Expression of PTPN11 shRNA was induced by dox when the 
tumour volume reached ~200 mm³. SHP2 knockdown led to a significant
reduction in p-ERK levels and marked tumour growth inhibition
(P < 0.05), whereas a control non-targeting shRNA showed no effect 
(Fig. 4a and Extended Data Fig. 5a, b). 

On the basis of its potent effects in cell culture, we next evaluated 
the efficacy of SHP099 in nude mice with established, subcutaneous 
KYSE520 xenografts. Following a single 100 mg per kg (body weight) 
oral dose, SHP099 yielded free plasma concentrations >10 µM and was
associated with a robust inhibition of p-ERK (>50%) that was main- 
tained for 24h after the dose was administered (Fig. 4b). At this expo- 
sure, SHP099 was predicted to achieve a significant anti-proliferative 
effect on the basis of in vitro characterization (Fig. 3a, b and Extended 




[Figure 4 plot residue. b, x axis, time after dose (h); series SHP099 plasma concentration, SHP099 (p-ERK/ERK) and Erlotinib (p-ERK/ERK). d, x axis, time after implantation (d); series Vehicle and SHP099.]


Data are plotted as the mean ± s.e.m. (n = 3). c, Antitumour efficacy of
SHP099 (100 mg per kg daily) or erlotinib (80 mg per kg daily) in nude
mice bearing established subcutaneous KYSE520 xenografts. Mice were
administered compounds daily by oral gavage starting 10 days after cell
implantation. Data are plotted as the group mean ± s.e.m. (n = 9).
d, Antitumour efficacy of SHP099 (75 mg per kg) in immunocompromised
mice with an orthotopic, primary-tumour-derived AML xenograft. Mice
were administered SHP099 daily by oral gavage starting 35 days after
tumour implantation and continued for 34 days. Tumour burden was
assessed by FACS detection of hCD45⁺ leukaemic cells in circulation.
Data are plotted as the group mean ± s.e.m. (n = 7). The arrow in a, c and
d denotes the start of SHP099 treatment.


Data Fig. 4a, b). SHP099 was administered by oral gavage at 100 mg 
per kg daily to nude mice with KYSE520 xenografts and yielded 
marked tumour growth inhibition (Fig. 4c) over a 24-day time 
period. In a follow-up study, orally administered SHP099 showed 
dose-dependent anti-tumour activity in the KYSE520 xenograft 
model and was well tolerated, as demonstrated by insignificant or no 
body weight loss over the entire course of treatment (Extended Data 
Fig. 5c, d). For comparison, we treated a parallel cohort of KYSE520 
tumour-bearing mice with the EGFR inhibitor erlotinib. The p-ERK 
modulation and tumour growth inhibition observed with erlotinib 
was equivalent to that observed with SHP099 (Fig. 4b, c). To extend 
this observation to the setting of RTK activation in haematologi- 
cal malignancies, SHP099 was evaluated in an orthotopic human- 
primary-tumour-derived FLT3-ITD acute myeloid leukaemia (AML) 
model. Here, daily dosing at 75 mg per kg led to near-complete eradication
of circulating human (h)CD45⁺ leukaemic cells (Fig. 4d)
and significantly reduced splenomegaly in the mice (Extended Data 
Fig. 5e, f). In summary, pharmacological inhibition of SHP2 by 
SHP099 is efficacious and well tolerated and therefore offers a novel 
therapeutic approach to target RTK-dependent cancers.

Despite two decades of research describing the central role of SHP2 
in developmental and oncogenic signalling pathways, no SHP2 inhib- 
itor has progressed to clinical use. Although catalytic SHP2 inhibitors 
have been described¹¹⁻¹⁵, they are typically of low potency and inhibit
other phosphatases. Although the allosteric inhibition of a metal-
dependent serine/threonine phosphatase family has been explored²¹,
SHP099 is the first example of a potent, selective and orally bioavail- 
able allosteric PTP inhibitor specific to SHP2 that is efficacious and 
well tolerated in patient-derived tumour xenograft models. Our study 
provides evidence that pharmacological inhibition of SHP2 is a viable 




strategy to target RTK-driven cancers and presents a new chemical
tool for further interrogation of the multifaceted cellular functions 
of SHP2 in development, tumorigenesis, RTK-driven drug resistance 
and immune-checkpoint modulation. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 5 November 2015; accepted 26 May 2016. 
Published online 29 June 2016. 


1. Grossmann, K. S., Rosario, M., Birchmeier, C. & Birchmeier, W. The tyrosine 
phosphatase Shp2 in development and cancer. Adv. Cancer Res. 106, 53-89 
(2010). 

2. Chan, R. J. & Feng, G. S. PTPN11 is the first identified proto-oncogene that 
encodes a tyrosine phosphatase. Blood 109, 862-867 (2007). 

3. Matozaki, T., Murata, Y., Saito, Y., Okazawa, H. & Ohnishi, H. Protein tyrosine 
phosphatase SHP-2: a proto-oncogene product that promotes Ras activation. 
Cancer Sci. 100, 1786-1793 (2009). 

4. Mohi, M. G. & Neel, B. G. The role of Shp2 (PTPN11) in cancer. Curr. Opin. 
Genet. Dev. 17, 23-30 (2007). 

5. Östman, A., Hellberg, C. & Böhmer, F. D. Protein-tyrosine phosphatases and
cancer. Nat. Rev. Cancer 6, 307-320 (2006).

6. Gavrieli, M., Watanabe, N., Loftin, S. K., Murphy, T. L. & Murphy, K. M.
Characterization of phosphotyrosine binding motifs in the cytoplasmic domain
of B and T lymphocyte attenuator required for association with protein
tyrosine phosphatases SHP-1 and SHP-2. Biochem. Biophys. Res. Commun.
312, 1236-1243 (2003).

7. Okazaki, T., Chikuma, S., Iwai, Y., Fagarasan, S. & Honjo, T. A rheostat for
immune responses: the unique properties of PD-1 and their advantages for
clinical application. Nat. Immunol. 14, 1212-1218 (2013).

8. Prahallad, A. et al. PTPN11 is a central node in intrinsic and acquired 
resistance to targeted cancer drugs. Cell Reports 12, 1978-1985 (2015). 

9. Schneeberger, V. E. et al. Inhibition of Shp2 suppresses mutant EGFR-induced
lung tumors in transgenic mouse model of lung adenocarcinoma. Oncotarget
6, 6191-6202 (2015).

10. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive
modelling of anticancer drug sensitivity. Nature 483, 603-607 (2012).

11. Scott, L. M. et al. Shp2 protein tyrosine phosphatase inhibitor activity of
estramustine phosphate and its triterpenoid analogs. Bioorg. Med. Chem. Lett.
21, 730-733 (2011).

12. Grosskopf, S. et al. Selective inhibitors of the protein tyrosine phosphatase
SHP2 block cellular motility and growth of cancer cells in vitro and in vivo.
ChemMedChem 10, 815-826 (2015).

13. He, R. et al. Exploring the existing drug space for novel pTyr mimetic and SHP2
inhibitors. ACS Med. Chem. Lett. 6, 782-786 (2015).

14. Hellmuth, K. et al. Specific inhibitors of the protein tyrosine phosphatase Shp2
identified by high-throughput docking. Proc. Natl Acad. Sci. USA 105,
7275-7280 (2008).

15. Zeng, L. F. et al. Therapeutic potential of targeting the oncogenic SHP2
phosphatase. J. Med. Chem. 57, 6594-6609 (2014).

16. Pluskey, S., Wandless, T. J., Walsh, C. T. & Shoelson, S. E. Potent stimulation of
SH-PTP2 phosphatase activity by simultaneous occupancy of both SH2
domains. J. Biol. Chem. 270, 2897-2900 (1995).

17. Hof, P., Pluskey, S., Dhe-Paganon, S., Eck, M. J. & Shoelson, S. E.
Crystal structure of the tyrosine phosphatase SHP-2. Cell 92, 441-450
(1998).

18. Manley, P. W. et al. Extended kinase profile and properties of the
protein kinase inhibitor nilotinib. Biochim. Biophys. Acta 1804, 445-453
(2010).

19. Bender, A. et al. Analysis of pharmacology data and the prediction of adverse 
drug reactions and off-target effects from chemical structure. ChemMedChem 
2, 861-873 (2007). 

20. Szczepankiewicz, B. G. et al. Discovery of a potent, selective protein tyrosine 
phosphatase 1B inhibitor using a linked-fragment strategy. J. Am. Chem. Soc. 
125, 4087-4096 (2003). 

21. Gilmartin, A. G. et al. Allosteric Wip1 phosphatase inhibition through 
flap-subdomain interaction. Nat. Chem. Biol. 10, 181-187 (2014). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements Use of the IMCA-CAT beamline 17-ID at the Advanced
Photon Source was supported by the companies of the Industrial
Macromolecular Crystallography Association through a contract
with Hauptman-Woodward Medical Research Institute. Use of the
Advanced Photon Source was supported by the US Department of
Energy, Office of Science, Office of Basic Energy Sciences, under contract
number DE-AC02-06CH11357.


Author Contributions Y.P.C., F.F., H.-X.H., K.H., S.L., J.S., P.Z. and H.M.C. performed
or directed cellular assay data generation and analysis; P.F., M.G.A.,
Z.B.K., S.H., E.P., C.Q., S.S., P.W., J.-H.Z. and P.D.F. performed or directed
biochemical experiments; B.A., V.G.C., B.F., H.G., L-R.L.B., M.J.M., M.P., G.Y.
and J.Y. performed or directed in vivo pharmacology or pharmacokinetic/
pharmacodynamics experiments and data analysis; J.R.D. and K.V. directed
or performed bioinformatics analyses; M.J.L., J.G.-F., C.F., C.H.-T.C., Z.C.,
D.G., R.K., M.K., J.L., F.L., G.L., D.M., M.P., L.P., M.D.S., T.S. and S.W. designed,
synthesized and/or directed the design or synthesis of SHP2 inhibitors;
Z.D., M.K. and S.W. performed protein and inhibitor structural modelling
or cheminformatics analyses; M.F., J.J. and T.S. designed, directed or
performed biophysics experiments; M.F. and T.S. directed or performed
X-ray crystallography experiments; Y.P.C., J.R.D., L.R.L.B., M.F., M.J.M.,
K.V., H.M.C., T.S., W.R.S. and P.D.F. prepared figures and tables for the main
text and Supplementary Information; Y.P.C., M.J.L., J.R.D., L-R.L.B., M.J.M.,
K.V., N.J.K., H.M.C., T.S., W.R.S. and P.D.F. wrote and edited the main text
and Supplementary Information; P.D.F., N.J.K., T.R., T.S., W.R.S. and M.W.
contributed to overall project oversight.


Author Information Atomic coordinates and structure factors for the SHP2- 
SHP099 binary complex structure have been deposited with the Protein 
Data Bank under accession number 5EHR. Reprints and permissions 
information is available at www.nature.com/reprints. The authors declare 
competing financial interests: details are available in the online version of the 
paper. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to
P.D.F. (pafortin@gmail.com), T.S. (travis.stams@novartis.com) or W.R.S.
(william.sellers@novartis.com).


Reviewer Information Nature thanks B. Neel and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 




METHODS 


Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized and the investigators were not blinded to 
allocation during outcome assessment. 
Bioinformatics. Statistical analyses were performed as follows: cell counts from 
the pooled shRNA experiments were normalized using a quantile normalization 
procedure as described elsewhere” and normalized scores for shRNA targeting 
the same gene were aggregated at the gene level using the ATARIS algorithm”. 
ATARIS scores for genes of interest were binned into three categories, 'dependent',
'independent' and 'unclear', on the basis of the degree of dropout. Most RTKs show
a strong phenotype in these shRNA assays; here, cell lines with ATARIS score
< −1 for any of the RTK genes (EGFR, FRS2, cMET, ERBB2, FLT3) were considered
'RTK-dependent', cell lines with ATARIS score > 0 were considered
'RTK-independent' and cell lines with ATARIS scores between 0 and −1 were
considered 'unclear' and removed from statistical analyses to remove weaker effects.
An identical approach was used for comparing with RAS/RAF (BRAF, NRAS and
KRAS). SHP2 shows a less marked shRNA phenotype in these shRNA assays and
the ATARIS score threshold for assigning SHP2-dependence was set to < −0.8.
Association analyses were done using a Fisher’s exact test to assess (1) the 
association of SHP2-dependence with RTK-dependence and (2) the association 
of SHP2-dependence with RTK-alterations, and the odds ratio and P value were 
reported. The statistical analysis for the association of RTK-dependence and 
SHP099 sensitivity was performed as follows. In cell lines derived from haematological
cancers, we defined RTK-dependence as RTK-altered/driven growth
and in CRC lines as lapatinib sensitivity (IC50 < 0.5 µM). The P value was derived
using the following relationships observed between RTK-dependence and sensitivity
to SHP099 (IC50 < 10 µM) in all 97 lines: 80 lines are SHP099 insensitive
(69 expected based on RTK independence and RAS/RAF mutation status),
0 lines are RTK-dependent and not SHP099 sensitive (10 expected), 4 lines are 
RTK-independent and SHP099 sensitive (14 expected), and 13 lines are both 
RTK-dependent and SHP099 sensitive (2 expected). 
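
The binning rule and the contingency analysis described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names are hypothetical, the test is a plain Pearson chi-squared test without continuity correction (an assumption, since the text does not state the variant used), and the upper threshold of 0 applied to SHP2 is likewise assumed. The observed counts (13, 0, 4 and 80) are those listed in the text.

```python
import math


def classify_dependence(ataris_score, dependent_below=-1.0, independent_above=0.0):
    """Bin an ATARIS score into 'dependent', 'independent' or 'unclear'."""
    if ataris_score < dependent_below:
        return "dependent"
    if ataris_score > independent_above:
        return "independent"
    return "unclear"  # scores in between are excluded from the statistics


def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared test of independence for the table [[a, b], [c, d]].

    Returns (statistic, p_value, expected_counts). With one degree of freedom
    the survival function of the chi-squared distribution is erfc(sqrt(x / 2)).
    """
    observed = ((a, b), (c, d))
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    expected = tuple(
        tuple(row_totals[i] * col_totals[j] / n for j in range(2)) for i in range(2)
    )
    statistic = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, expected


# SHP2 uses the less stringent threshold of -0.8 given in the text.
print(classify_dependence(-0.9, dependent_below=-0.8))

# Rows: RTK-dependent, RTK-independent. Columns: SHP099 sensitive, insensitive.
statistic, p_value, expected = chi_square_2x2(13, 0, 4, 80)
print(statistic, p_value)
print([[round(e, 1) for e in row] for row in expected])
```

The expected counts computed this way (about 2.3, 10.7, 14.7 and 69.3) agree with the approximate values of 2, 10, 14 and 69 quoted in the text.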
Cell culture, viral production and infection. Human cancer cell lines originated
from the CCLE and were authenticated by single-nucleotide polymorphism analysis and
tested for mycoplasma infection¹⁰. Sum52, KYSE520, MDA-MB-468, KATO
III, NCI-H2170, NCI-H2228, MDA-MB-231 and A2058 were cultured
in RPMI medium (Invitrogen) supplemented with 10% fetal bovine serum
(Invitrogen). H293 cells were grown in DMEM medium (Invitrogen) supplemented
with 10% fetal bovine serum. For viral production, vectors pshSHP2 or
pshNTC (nontargeting control) were transfected in H293 cells using TransIT-293
transfection reagent (Mirus Inc.), following the manufacturer's protocol. At 72 h
after transfection, the cell culture medium was filtered through a 0.45 µm filter,
and the viral supernatant supplemented with 8 µg ml⁻¹ of polybrene was used for
the infection of cells. For viral infection, ~70% confluent cells in six-well dishes
were infected with virus at a multiplicity of infection of 5 units per cell for 4h 
and allowed to recover for 24h with fresh medium. Stable clones were selected 
using either puromycin or G418. Methods for profiling small molecule inhibitors 
in the haematopoietic cells are described elsewhere’. Cells were treated using 
SHP099 concentrations varying from 0 to 30 µM. Cellular viability was measured
using CellTiter-Glo. 
SHP2 allosteric inhibition assay. SHP2 is allosterically activated through binding
of bis-tyrosyl-phosphorylated peptides to its Src Homology 2 (SH2) domains.
The latter activation step leads to the release of the auto-inhibitory interface
of SHP2, which in turn renders the SHP2 protein tyrosine phosphatase (PTP)
active and available for substrate recognition and reaction catalysis. The catalytic
activity of SHP2 was monitored using the surrogate substrate DiFMUP in a
prompt fluorescence assay format. More specifically, the phosphatase reactions
were performed at room temperature in 384-well black polystyrene plates, flat
bottom, low flange, non-binding surface (Corning) using a final reaction volume
of 25 µl and the following assay buffer conditions: 60 mM HEPES, pH 7.2,
75 mM NaCl, 75 mM KCl, 1 mM EDTA, 0.05% P-20, 5 mM DTT. SHP2 (0.5 nM)
was co-incubated with 0.5 µM of bisphosphorylated IRS1 peptide (sequence:
H2N-LN(pY)IDLDLV(dPEG8)LST(pY)ASINFQK-amide) and 0.003-100 µM
of the inhibitory compounds. After 30-60 min incubation at 25°C, the surrogate
substrate DiFMUP (Invitrogen) was added to the reaction and incubated
at 25°C for 30 min. The reaction was then quenched by the addition of 5 µl of
a 160 µM solution of bpV(Phen) (Enzo Life Sciences). The fluorescence signal
was monitored using a microplate reader (Envision, Perkin-Elmer) using excitation
and emission wavelengths of 340 nm and 450 nm, respectively. The inhibitor
dose-response curves were analysed using normalized IC50 regression curve
fitting with control-based normalization.
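
As an illustration of what control-based normalization followed by IC50 regression can look like, the sketch below normalizes raw fluorescence to uninhibited and fully inhibited controls and then fits a one-site Hill model by a coarse grid search. The model (top fixed at 100%, bottom at 0%, slope of 1), the function names, the control values and the synthetic data are all assumptions made for this example; the text does not specify the fitting software or equation.

```python
def percent_activity(signal, high_control, low_control):
    """Scale a raw signal so the uninhibited control is 100% and the blank is 0%."""
    return 100.0 * (signal - low_control) / (high_control - low_control)


def hill(conc_um, ic50_um, slope=1.0):
    """Percent activity remaining at a given inhibitor concentration."""
    return 100.0 / (1.0 + (conc_um / ic50_um) ** slope)


def fit_ic50(concs_um, activities, slope=1.0):
    """Least-squares IC50 estimate from a log-spaced grid of 0.001 to 100 uM."""
    candidates = [10 ** (-3 + 0.01 * i) for i in range(501)]

    def squared_error(ic50_um):
        return sum((hill(c, ic50_um, slope) - a) ** 2
                   for c, a in zip(concs_um, activities))

    return min(candidates, key=squared_error)


# Synthetic example spanning the 0.003-100 uM range used in the assay.
concs = [0.003, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
high, low = 1000.0, 50.0  # hypothetical raw control signals
raw = [low + (high - low) * hill(c, 0.071) / 100.0 for c in concs]

normalized = [percent_activity(s, high, low) for s in raw]
estimated_ic50 = fit_ic50(concs, normalized)
print(f"estimated IC50 = {estimated_ic50:.3f} uM")
```

A grid search is used here only to keep the example dependency-free; a nonlinear least-squares routine with free top, bottom and slope parameters would be the more usual choice for real data.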
Protein expression and purification. Two constructs of human SHP2 (accession
number NP_002825.3) were generated by cloning the PTPN11 gene encoding
truncations Met1-Leu525 (named SHP2) and Ala237-Ile529 (named SHP2^PTP)
into a pET30 vector. A coding sequence for a 6x histidine tag followed by a
tobacco etch virus (TEV) protease consensus sequence was added 5′ to the
construct sequences. The constructs were transformed into BL21 Star (DE3) cells and
grown at 37°C in Terrific Broth containing 100 µg ml⁻¹ kanamycin. At an OD600
of 4.0, SHP2 expression was induced using 1 mM IPTG. Cells were collected after
overnight growth at 18°C.

Cell pellets were resuspended in lysis buffer containing 50 mM Tris-HCl
(pH 8.5), 25 mM imidazole, 500 mM NaCl, 2.5 mM MgCl2, 1 mM TCEP, 1 µg ml⁻¹
DNaseI, and complete EDTA-free protease inhibitor and lysed using a microfluidizer,
followed by ultracentrifugation. The supernatant was loaded onto a
HisTrap HP chelating column in 50 mM Tris-HCl, 25 mM imidazole, 500 mM
NaCl, 1 mM TCEP and protein was eluted with the addition of 250 mM imidazole.
The N-terminal histidine tag was removed with an overnight incubation using
TEV protease at 4°C. The protein was subsequently diluted to 50 mM NaCl with
20 mM Tris-HCl (pH 8.5), 1 mM TCEP then applied to a HiTrap Q FastFlow
column equilibrated with 20 mM Tris (pH 8.5), 50 mM NaCl, 1 mM TCEP. The
protein was eluted with a 10-column volume gradient from 50-500 mM NaCl.
Fractions containing SHP2 were pooled and concentrated then loaded onto a
HiLoad Superdex200 PG 16/100 column, exchanging the protein into the crystallization
buffer, 20 mM Tris-HCl (pH 8.5), 150 mM NaCl and 3 mM TCEP. The
protein was concentrated to 15 mg ml⁻¹ for use in crystallization. QuikChange
mutagenesis (Agilent) was used to generate the SHP2^T253M/Q257L mutant using the
above construct, and the same procedure was used for expression and purification.
Chemistry. All solvents employed were commercially available ‘anhydrous’ grade, 
and reagents were used as received unless otherwise noted. A Biotage Initiator 
Sixty system was used for microwave heating. Flash column chromatography was 
performed on either an Analogix Intelliflash 280 using Si 50 columns (32-63 µm,
230-400 mesh, 60 Å) or on a Biotage SP1 system (32-63 µm particle size, KP-Sil,
60 Å pore size). Preparative high pressure liquid chromatography (HPLC) was
performed using a Waters 2525 pump with 2487 dual wavelength detector and
2767 sample manager. Columns were Waters C18 OBD 5 µm, either 50 × 100 mm
Xbridge or 30 × 100 mm Sunfire. NMR spectra were recorded on Bruker AV400
(Avance 400 MHz) or AV600 (Avance 600 MHz) instruments. Analytical LC-MS
was conducted using an Agilent 1100 series with UV detection at 214 nm and
254 nm, and an electrospray mode (ESI) coupled with a Waters ZQ single quad
mass detector. One of two methods was used: (1) 3 µl of sample was injected on an
inertisil C8 3 cm × 5 mm × 3 µm and eluted using a 5:95 to 95:5 acetonitrile:H2O
(5 mM ammonium formate) 2 min gradient; (2) 3 µl of sample was injected on an
inertisil C8 3 cm × 5 mm × 3 µm and eluted using a 20:80 to 5:95 acetonitrile:H2O
(10mM ammonium formate) 2 min gradient. The purity of all tested compounds 
was determined by LC/ESI-MS data recorded using an Agilent 6220 mass spec- 
trometer with electrospray ionization source and Agilent 1200 liquid chroma- 
tography. The mass accuracy of the system has been found to be <5 ppm. HPLC 
separation was performed at 75 ml min⁻¹ flow rate with the indicated gradient
within 3.5 min with an initial hold of 10s. 10 mM ammonia hydroxide or 0.1M 
TFA was used as the modifier additive in the aqueous phase. All compounds were 
found to be >95% purity. 

SHP836 originated from a purchased chemical library included in the gen- 
eral Novartis screening chemical library. SHP836 is also known as GW286103 
(ref. 24). SHP043 was synthesized as previously described”>. 

6-(4-amino-4-methylpiperidin-1-yl)-3-(2,3-dichlorophenyl)pyrazin-2-amine
(SHP099). A mixture of 3-bromo-6-chloropyrazine-2-amine (1.5 g, 7.2 mmol),
(2,3-dichlorophenyl)boronic acid (1.37 g, 7.2 mmol), PdCl2(dppf)·DCM
adduct (294 mg, 0.36 mmol), and potassium phosphate (4.58 g, 21.59 mmol) in
MeCN:H2O (9:1, 15 ml) was stirred for 4 h at 120°C. After cooling to room temperature,
the reaction mixture was filtered through a pad of Celite followed by
EtOAc wash. The solvent was removed under reduced pressure and the resulting
residue was purified by silica chromatography (5 to 30% gradient of EtOAc in
heptane) to give 6-chloro-3-(2,3-dichlorophenyl)pyrazin-2-amine (1.46 g,
5.32 mmol) as yellow solid. ¹H NMR (400 MHz, chloroform-d) δ ppm 7.99-8.08
(m, 1H), 7.62 (dd, J = 7.78, 1.76 Hz, 1H), 7.36-7.42 (m, 1H), 7.32-7.36 (m, 1H),
4.69 (br s, 2H). ¹³C NMR (101 MHz, chloroform-d) δ ppm 151.58, 146.66, 136.56,
136.30, 134.24, 132.13, 131.78, 131.56, 129.34, 128.32. HRMS calculated for
C10H7Cl3N3 (M+H)⁺ 273.9706, found 273.9706.
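
The quantities given for this coupling step imply an isolated yield and reagent loadings that are not stated explicitly. The short calculation below derives them from the millimole values in the text; the helper names are hypothetical and the aryl bromide is assumed to be the limiting reagent.

```python
def percent_yield(product_mmol, limiting_mmol):
    """Isolated yield as a percentage of the limiting reagent."""
    return 100.0 * product_mmol / limiting_mmol


def equivalents(reagent_mmol, limiting_mmol):
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


limiting = 7.2  # mmol of 3-bromo-6-chloropyrazine-2-amine

step1_yield = percent_yield(5.32, limiting)              # coupled product isolated
catalyst_mol_percent = 100.0 * equivalents(0.36, limiting)  # PdCl2(dppf) DCM adduct
base_equiv = equivalents(21.59, limiting)                # potassium phosphate

print(f"yield: {step1_yield:.1f}%")
print(f"catalyst loading: {catalyst_mol_percent:.1f} mol%")
print(f"base: {base_equiv:.2f} equiv")
```

On these numbers the step proceeds in roughly 74% yield with 5 mol% catalyst and about 3 equivalents of base.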

A mixture of 6-chloro-3-(2,3-dichlorophenyl)pyrazin-2-amine (125 mg,
0.455 mmol), tert-butyl (4-methylpiperidin-4-yl)carbamate (195 mg,
0.911 mmol), and potassium phosphate (97 mg, 0.455 mmol) in NMP (1 ml)
was stirred for 36 h at 140 °C. After cooling to room temperature, the mixture was
poured into a separation funnel containing saturated aqueous NH4Cl and
extracted with EtOAc (3 × 5 ml). The combined organic phases were dried
over MgSO4, filtered and the solvents were removed under reduced pressure. The
resulting residue was purified by silica chromatography (5 to 30% gradient of EtOAc in
heptane) to give tert-butyl (1-(6-amino-5-(2,3-dichlorophenyl)pyrazin-2-yl)-4-methylpiperidin-4-yl)carbamate
(113 mg, 0.250 mmol) as a yellow solid. 1H
NMR (400 MHz, chloroform-d) δ ppm 7.54 (s, 1H), 7.43 (dd, J = 7.78, 2.01 Hz,
1H), 7.16-7.29 (m, 2H), 4.36 (br s, 1H), 4.17 (s, 2H), 3.80 (dt, J = 13.68, 4.33 Hz,
2H), 3.18-3.31 (m, 2H), 2.03 (br d, J = 13.30 Hz, 2H), 1.59 (ddd, J = 13.87,
10.23, 4.27 Hz, 2H), 1.35-1.42 (m, 9H), 1.32-1.35 (m, 3H). 13C NMR (101 MHz,
chloroform-d) δ ppm 154.47, 153.58, 149.85, 138.65, 133.72, 132.40, 130.33,
130.11, 127.89, 125.61, 119.19, 79.23, 50.57, 40.60 (×2), 35.83 (×2), 28.46
(×3), 28.21. HRMS calculated for C21H28Cl2N5O2 (M+H)+ 454.1591, found
454.1591.

A solution of tert-butyl (1-(6-amino-5-(2,3-dichlorophenyl)pyrazin-2-yl)-4-methylpiperidin-4-yl)carbamate
(113 mg, 0.250 mmol) in THF:H2O (4:1,
2.5 ml) was treated with HCl (4 M in dioxane, 230 µl, 0.928 mmol). The resulting
mixture was stirred for 2 h at 140 °C. After cooling to room temperature, the
volatiles were removed under reduced pressure, and the resulting residue was
diluted with EtOAc (10 ml) and H2O (10 ml). The phases were separated and the
aqueous phase was further extracted with EtOAc (2 × 5 ml). The combined organic
phases were discarded; the aqueous phase was basified to pH 9 with 2 M NaOH,
and extracted with EtOAc (3 × 10 ml). The combined organic phases were dried
over MgSO4, filtered and the solvent was removed under reduced pressure to
afford 6-(4-amino-4-methylpiperidin-1-yl)-3-(2,3-dichlorophenyl)pyrazin-2-amine
(84 mg, 0.238 mmol) as a yellow solid. 1H NMR (400 MHz, methanol-d4)
δ ppm 7.61 (dd, J = 7.91, 1.63 Hz, 1H), 7.47 (s, 1H), 7.40 (t, J = 7.78 Hz, 1H), 7.34
(dd, J = 7.65, 1.63 Hz, 1H), 3.78 (ddd, J = 13.43, 7.15, 4.27 Hz, 2H), 3.50-3.64 (m,
2H), 1.55-1.75 (m, 4H), 1.25 (s, 3H). 13C NMR (101 MHz, methanol-d4) δ ppm
155.42, 152.89, 139.87, 134.55, 133.90, 131.60, 131.48, 129.24, 125.78, 117.54,
54.84, 42.18 (×2), 39.40 (×2), 27.55. HRMS calculated for C16H20Cl2N5 (M+H)+
352.1096, found 352.1099.
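
The HRMS figures quoted in the synthesis can be checked with a short calculation. The sketch below is illustrative only: it assumes the (M+H)+ compositions C10H7Cl3N3 for the chloropyrazine intermediate and C16H20Cl2N5 for SHP099 (inferred from the compound names, not stated separately in the text), uses standard monoisotopic masses, and sums atoms without subtracting the electron mass. The function names are hypothetical.

```python
# Monoisotopic masses (u) of the lightest stable isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
}


def monoisotopic_mass(composition):
    """Sum monoisotopic masses for a composition such as {"C": 10, "H": 7}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(found, calculated):
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


# Assumed (M+H)+ compositions; see the lead-in paragraph.
intermediate_mh = monoisotopic_mass({"C": 10, "H": 7, "Cl": 3, "N": 3})
shp099_mh = monoisotopic_mass({"C": 16, "H": 20, "Cl": 2, "N": 5})

# Found value for SHP099 taken from the text (352.1099).
shp099_ppm = ppm_error(352.1099, shp099_mh)

print(f"intermediate (M+H)+ = {intermediate_mh:.4f}")
print(f"SHP099 (M+H)+       = {shp099_mh:.4f}")
print(f"SHP099 mass error   = {shp099_ppm:.2f} ppm")
```

Under these assumptions the atom sums agree with the quoted calculated values to four decimal places, and the SHP099 error falls within the <5 ppm mass accuracy stated for the instrument.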

Crystallization and structure determination. The sitting-drop vapour diffusion
method was used for crystallization, with the crystallization well containing 17%
PEG 3350 and 200 mM ammonium phosphate and a drop with a 1:1 volume of
SHP2 protein and crystallization solution. Crystals formed within five days
and were subsequently soaked in the crystallization solution with 2.5 mM SHP099.
This was followed by cryoprotection using the crystallization solution with the
addition of 20% glycerol and 1 mM of compound 1, followed by flash freezing
directly in liquid nitrogen.

Diffraction data for the SHP2-SHP099 complex were collected on a Dectris 
Pilatus 6M Detector at beamline 17ID (IMCA-CAT) at the Advanced Photon 
Source at Argonne National Laboratories. The data were measured from a 
single crystal maintained at 100 K at a wavelength of 1 Å, and the reflections were
indexed, integrated, and scaled using XDS (ref. 26). The space group of the complex
was P2₁ with two molecules in the asymmetric unit. The structure was determined
with Fourier methods, using the previously reported SHP2 apo structure (PDB accession
2SHP) with all waters removed. Structure determination was achieved through
iterative rounds of positional refinement and model building using BUSTER (ref. 27)
and Coot (ref. 28), yielding the published SHP2-SHP099 binary complex structure
(PDB accession number 5EHR). Individual B-factors were refined using
an overall anisotropic B-factor refinement along with bulk solvent correction. 
The solvent, phosphate ions, and inhibitor were built into the density in later 
rounds of the refinement. Data collection and refinement statistics are shown in 
Extended Data Table 4. 

Isothermal titration calorimetry. The binding of SHP099 was studied by isothermal
titration calorimetry (ITC) using the Auto iTC-200 calorimeter from
Malvern Instruments. SHP2 was dialysed and SHP099 was dissolved into the
identical buffer composed of 25 mM Hepes (pH 7.5), 100 mM NaCl, and 0.25 mM
TCEP with 2% DMSO. The titration was performed at 25 °C by injecting 2.5 µl
aliquots of SHP099 into the calorimetric cell (~200 µl) containing the protein
at a concentration of 55 µM. The concentration of SHP099 in the syringe was
450 µM. The heat evolved upon each injection was obtained from the integral of
the calorimetric signal. The individual heats were plotted against the molar ratio,
and the enthalpy change (ΔH) and association constant (Ka = 1/Kd) were obtained
by nonlinear regression of the data.
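
The relation Ka = 1/Kd, together with the standard identity ΔG = −RT ln Ka, can be illustrated numerically. The sketch below is not the authors' fitting procedure; it only converts a fitted association constant into a dissociation constant and a free energy. The value 1.38 × 10⁷ M⁻¹ is the K printed in the ITC panel of Extended Data Fig. 3, and the temperature is the 25 °C stated above; the function names are hypothetical.

```python
import math

R_CAL = 1.987204  # gas constant in cal mol^-1 K^-1


def kd_from_ka(ka_per_molar):
    """Dissociation constant (M) from association constant (M^-1)."""
    return 1.0 / ka_per_molar


def delta_g_cal_per_mol(ka_per_molar, temperature_k=298.15):
    """Binding free energy in cal/mol from dG = -RT ln(Ka)."""
    return -R_CAL * temperature_k * math.log(ka_per_molar)


ka = 1.38e7  # M^-1, from the ITC fit shown in Extended Data Fig. 3a
kd_nM = kd_from_ka(ka) * 1e9
dg = delta_g_cal_per_mol(ka)

print(f"Kd = {kd_nM:.1f} nM")
print(f"dG = {dg:.0f} cal/mol")
```

The resulting Kd of roughly 72 nM is consistent with the 73 ± 15 nM quoted in the legend of Extended Data Fig. 3.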

p-ERK cellular assay. The p-ERK cellular assay was performed using the AlphaScreen SureFire
Phospho-ERK 1/2 Kit (PerkinElmer): A2058, KYSE520 or MDA-MB-468
cells (30,000 cells per well) were grown in 96-well plate culture overnight and
treated with SHP099 at concentrations of 20, 6.6, 2.2, 0.74, 0.24, 0.08 and 0.027 µM
for 2 h at 37 °C. Incubations were terminated by addition of 30 µl of lysis buffer
(PerkinElmer) supplied with the SureFire phospho-extracellular signal-regulated
kinase (p-ERK) assay kit (PerkinElmer). Samples were processed according to the
manufacturer’s directions. The fluorescence signal from p-ERK was measured in
duplicate using a 2101 multi-label reader (PerkinElmer EnVision). The percentage
of inhibition was normalized by the total ERK signal and compared with the
DMSO vehicle control.
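
One way to read the normalization described above is sketched here. The exact formula is not given in the text, so the expression below (p-ERK divided by total ERK, expressed relative to the DMSO control) is an assumption, as are the function names and the example signal values. The second helper shows that the listed concentrations correspond approximately to a threefold dilution series starting at 20 µM.

```python
def percent_inhibition(perk_treated, erk_treated, perk_dmso, erk_dmso):
    """Percentage inhibition of p-ERK, normalized to total ERK and to DMSO.

    Assumed formula: 100 * (1 - (pERK/ERK)_treated / (pERK/ERK)_DMSO).
    """
    ratio_treated = perk_treated / erk_treated
    ratio_dmso = perk_dmso / erk_dmso
    return 100.0 * (1.0 - ratio_treated / ratio_dmso)


def threefold_series(top_uM=20.0, n_points=7):
    """Concentrations (µM) of a threefold serial dilution from top_uM."""
    return [top_uM / 3 ** i for i in range(n_points)]


# Hypothetical raw signals, for illustration only.
example = percent_inhibition(perk_treated=500, erk_treated=1000,
                             perk_dmso=2000, erk_dmso=1000)
series = threefold_series()

print(f"inhibition = {example:.1f}%")
print([round(c, 3) for c in series])
```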

Colony formation assay and cell proliferation assay. KYSE520, MDA-MB-468,
A2058, SUM52 and KATO III cells (1,500 cells per well) were plated onto 24-well plates
in 300 µl medium (RPMI-1640 containing 10% FBS, Lonza). For drug treatment,
SHP099 was added at various concentrations (10, 5, 2.5 and 1.25 µM) 24 h and 5 days
after cell plating. At day 11, colonies were stained with 0.2% crystal violet (MP
Biomedicals) and subsequently dissolved in 20% acetic acid for quantitation using
a Spectramax reader (Thermo Scientific). In the cell proliferation assay, KYSE520,
A2058 and colorectal cancer cells (1,500 cells per well) were plated onto 96-well
plates in 100 µl medium (RPMI-1640 containing 10% FBS, Lonza) and treated
with SHP099 and/or lapatinib at concentrations varying from 0.0 to 20.25 µM.
At day 5, 50 µl CellTiter-Glo reagent (Promega) was added, and the luminescent
signal was determined according to the supplier’s instructions (Promega). The method
for profiling SHP099 in the haematopoietic cancer cell panels was described
previously!°, Cells were treated using SHP099 concentration varying from 0.0 to 
30 µM. Cellular viability was measured using CellTiter-Glo at day 3.

Western blotting. Cells were lysed on ice for 30 min with CST lysis buffer (Cell
Signaling) containing phosSTOP (Roche). Cell lysates were centrifuged at 4 °C
for 15 min with a microfuge. Protein concentrations of cell lysate supernatants
were measured. Equal amounts of protein from the cell lysate supernatants were used for
immunoblotting. The following antibodies were used: SHP2 (Santa Cruz SC-280),
p-ERK (CST #4377S), ERK (Santa Cruz SC-93), RAS (CST #3965S), p-AKT (CST
#4060S), GAPDH (CST #2118S).

Tumour xenograft experiments and tissue-sample preparations. All animal
studies were carried out according to the Novartis Guide for the Care and Use
of Laboratory Animals. Cell lines were confirmed to be devoid of mycoplasma
and mouse viruses before use. Sample sizes were determined roughly on the basis
of a power analysis using historical internal xenograft tumour volume data and
anti-tumour responses. In the efficacy studies, animals were randomly assigned
to treatment groups by an algorithm that moves animals around to achieve the
best-case distribution, so that each treatment group has a similar mean
tumour burden and standard deviation. Female athymic nude mice (9-12 weeks
of age) were inoculated subcutaneously (3 × 10⁶ cells in a suspension containing
50% phenol red-free matrigel (BD Biosciences) in Hank’s balanced salt solution)
with parental KYSE520 cells or KYSE520 cells stably expressing dox-inducible
control non-targeting shRNA or distinct PTPN11-targeting shRNA. For
pharmacokinetic/pharmacodynamic studies, mice were administered a single dose
of vehicle control, erlotinib, or SHP099 by oral gavage once tumours reached
roughly 500 mm³. Mice were subsequently killed at predetermined time points
following a single dose of compound, at which point plasma and xenograft fragments
were collected for determination of SHP099 concentrations and p-ERK
modulation, respectively. For efficacy studies, xenograft tumours were measured
twice weekly by calipering in two dimensions. Once tumours reached roughly
200 mm³, mice were randomly assigned to treatment groups. For the shRNA
study, on day 10 after cell line implantation, mice were assigned to receive either
vehicle diet (standard diet) or dox-supplemented diet (Mod LabDiet 5053,
400 p.p.m. dox) for the duration of the study. For the efficacy study, on day 10
after cell line implantation, mice were assigned to receive either vehicle, SHP099
(100 mg per kg daily), or erlotinib (80 mg per kg daily) by oral gavage. In both
efficacy studies, tumour volume and mouse body weight were assessed twice
weekly. To assess MAPK pathway modulation in xenograft protein lysates, total
and phospho-ERK1/2 were assessed using a commercially available kit (Meso
Scale Discovery catalogue number K15107D). The assay was conducted as
recommended by Meso Scale Discovery with the exception that protein lysate was
incubated overnight. The development of the patient-derived AML xenograft
model in mice and the study design have been previously described (ref. 29). An additional
group of mice (n = 7) was added to the study on day 35 after tumour implantation
and treated with SHP099 (75 mg per kg daily) for 34 days. At the end of the study
(69 days after tumour implantation), mice were euthanized, and spleen weights
of individual mice were recorded. In all cases, no data or animals were excluded
and results are expressed as mean and standard error of the mean. No further
statistical analysis was performed.
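
The group-assignment step is described only in outline (animals are moved around until groups have similar mean tumour burden and standard deviation). The sketch below shows one simple way such balancing could work, a seeded random search over equal-sized partitions; it is an illustration under that assumption, not the algorithm actually used, and all names, the cost function and the example volumes are hypothetical.

```python
import random
import statistics


def balance_cost(groups):
    """Spread of group means plus spread of group standard deviations."""
    means = [statistics.mean(g) for g in groups]
    sds = [statistics.stdev(g) for g in groups]
    return (max(means) - min(means)) + (max(sds) - min(sds))


def assign_balanced(volumes, n_groups, n_trials=2000, seed=0):
    """Return (groups of animal indices, cost) for the best random partition.

    Each trial shuffles the animals and deals them into n_groups; the
    partition with the smallest balance_cost is kept.
    """
    rng = random.Random(seed)
    indices = list(range(len(volumes)))
    best_cost, best_groups = float("inf"), None
    for _ in range(n_trials):
        rng.shuffle(indices)
        groups = [indices[i::n_groups] for i in range(n_groups)]
        cost = balance_cost([[volumes[j] for j in g] for g in groups])
        if cost < best_cost:
            best_cost, best_groups = cost, [sorted(g) for g in groups]
    return best_groups, best_cost


# Hypothetical tumour volumes (mm^3) for 21 animals, split into 3 groups of 7.
volumes = [180 + 5 * i for i in range(21)]
groups, cost = assign_balanced(volumes, n_groups=3)

for k, g in enumerate(groups):
    vols = [volumes[j] for j in g]
    print(k, g, round(statistics.mean(vols), 1), round(statistics.stdev(vols), 1))
```

A random search is used here only because it is short and easy to verify; a production tool might instead swap animals between groups iteratively.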

Pharmacokinetics. Plasma samples were precipitated and diluted with acetonitrile
containing internal standard and prepared for LC-MS/MS. An aliquot (20 µl)
of each sample was injected into the API4000 LC-MS/MS system for analysis,
and transitions of 352.05 amu (Q1) and 267.10 amu (Q3) were monitored. All
pharmacokinetic parameters were derived from concentration-time data by
non-compartmental analyses. Pharmacokinetic parameters were calculated using the
computer program WinNonlin (Version 6.4) purchased from Certara Company.
Results are expressed as mean and standard error of the mean. No further
statistical analysis was performed.
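
As a rough illustration of what a non-compartmental analysis derives from concentration-time data, the sketch below computes Cmax, Tmax and the area under the curve by the linear trapezoidal rule. It is a minimal sketch, not a reproduction of WinNonlin; the function name and the example data are hypothetical.

```python
def nca_summary(times_h, concentrations):
    """Cmax, Tmax and AUC(0-tlast) by the linear trapezoidal rule."""
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matched time/concentration points")
    auc = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        auc += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    cmax = max(concentrations)
    tmax = times_h[concentrations.index(cmax)]
    return {"cmax": cmax, "tmax": tmax, "auc_last": auc}


# Hypothetical plasma profile: time in hours, concentration in µM.
summary = nca_summary([0, 1, 2, 4], [0.0, 10.0, 8.0, 4.0])
print(summary)
```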




22. Hoffman, G. R. et al. Functional epigenetics approach identifies BRM/SMARCA2 as a critical synthetic lethal target in BRG1-deficient cancers. Proc. Natl Acad. Sci. USA 111, 3128-3133 (2014).
23. Shao, D. D. et al. ATARiS: computational quantification of gene suppression phenotypes from multisample RNAi screens. Genome Res. 23, 665-678 (2013).
24. Clare, J. J., Tate, S. N., Nobbs, M. & Romanos, M. A. Voltage-gated sodium channels as therapeutic targets. Drug Discov. Today 5, 506-520 (2000).
25. Zhao, H. et al. Isoxazole carboxylic acids as protein tyrosine phosphatase 1B (PTP1B) inhibitors. Bioorg. Med. Chem. Lett. 14, 5543-5546 (2004).
26. Kabsch, W. XDS. Acta Crystallogr. D 66, 125-132 (2010).
27. Bricogne, G. et al. BUSTER version 2.8.0. (Global Phasing Ltd., 2009).
28. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486-501 (2010).
29. Weisberg, E. et al. Inhibition of wild-type p53-expressing AML by the novel small molecule HDM2 inhibitor CGM097. Mol. Cancer Ther. 14, 2249-2259 (2015).


[Extended Data Figure 1 image: crystal violet colony-formation wells (control versus SHP2 shRNA, −Dox and +Dox) and western blot panels for SHP2, ERK, GAPDH and tubulin; panel c is labelled SHP2-shRNA with GFP, wild-type HA-SHP2 and HA-SHP2 C459S.]


Extended Data Figure 1 | SHP2 depletion inhibits the growth of
RTK-amplified cancer cells. a, Cells expressing dox-inducible SHP2
shRNA in various RTK-amplified cancer cells were generated including
SUM52 (FGFR2), KATO III (FGFR2), MDA-MB-468 (EGFR), KYSE520 (EGFR),
NCI-H2170 (HER2), NCI-H2228 (EML4-ALK). b, Stable clones of
MDA-MB-231 (KRAS G13D) and A2058 (BRAF V600E) cancer cells were
established as controls. Cells were treated with dox, and colony formation
was measured after 11 days by crystal violet staining. c, Western blot
showing the expression of SHP2, p-ERK and ERK in the presence (+) or
absence (−) of dox in MDA-MB-468 SHP2-depleted cells stably expressing
either GFP, wild-type HA-SHP2 or HA-SHP2 C459S. d, Western blot of
SHP2, p-ERK and ERK in the presence (+) or absence (−) of dox in
SHP2-depleted SUM52 cells expressing vector control or HA-KRAS G12V.
In c and d, dox treatment induces depletion of endogenous SHP2 protein
and simultaneously expression of the exogenous proteins GFP, wild-type
HA-SHP2, HA-SHP2 C459S or HA-KRAS G12V.


[Extended Data Figure 2 image: a, western blot for p-ERK in SHP2-shRNA cells expressing wild-type HA-SHP2 or HA-SHP2 C459S, with and without dox; b, colony-formation wells (−Dox and +Dox); c, bar chart of phosphatase activity.]

Extended Data Figure 2 | Phosphatase activity is required for cancer
growth. a, Western blot of SHP2, p-ERK and ERK in SHP2-depleted
SUM52 cells stably expressing vehicle, wild-type SHP2 or HA-SHP2 C459S.
Note, four lanes corresponding to an unrelated study were removed from
the image. All lanes originated from the same gel at the same exposure.
b, Colony formation of SHP2-depleted SUM52 cells stably expressing
vehicle, wild-type SHP2 or HA-SHP2 C459S. SHP2 knockdown and SHP2
variant re-expression were performed using dox treatment. Colony formation
was monitored after 11 days with crystal violet staining. c, Phosphatase
activity of SHP2-depleted SUM52 cells stably expressing HA-SHP2 or
HA-SHP2 C459S. Cells were treated with dox for 3 days. SHP2 protein
was immunoprecipitated from cell lysates and phosphatase activity was
measured using DiFMUP assay. Data are presented as mean ± s.d. (n = 3).

[Extended Data Figure 3 image. a, ITC thermogram (time in min) and binding isotherm plotted against molar ratio; fitted values printed in the panel: N = 1.07 ± 0.00459 sites, K = 1.38 × 10⁷ ± 2.90 × 10⁶ M⁻¹, ΔH = −6209 ± 49.58 cal mol⁻¹, ΔS = 11.8 cal mol⁻¹ deg⁻¹. b-d, structural representations. e, SHP2 activity plotted against 2P-IRS-1 peptide concentration (µM) for wild type, Q257L and T253M/Q257L, with the table below.]

SHP2 variant | 2P-IRS-1 activation AC50 (µM) | SHP099 IC50 (µM)
WT | 0.22 | 0.071
Q257L | 0.24 | 1.33
T253M/Q257L | 0.38 | >100

Extended Data Figure 3 | Thermodynamic characterization of the
SHP099-SHP2 binding complex, comparison of SHP2 and SHP1’s
allosteric pocket and characterization of SHP099-resistant SHP2
mutants. a, Isothermal titration calorimetry of SHP099 binding to SHP2.
SHP099 binds stoichiometrically to SHP2 with a dissociation constant
measured at 73 ± 15 nM. b, Structural differences in the central tunnel
between SHP1 and SHP2. Ribbon representation of SHP2 (multi-colour)
and SHP1 (grey) X-ray structures in the closed conformation. The PTP
(tan) and N-SH2 (green) domains overlay well (r.m.s.d. < 1.5 Å); however,
the C-SH2 (blue) domain has a significantly different orientation.
c, Surface representation of SHP2-SHP099 co-crystal structure. d, Surface
representation of SHP1 with SHP099 modelled on the basis of the SHP2
superimposition. Central tunnel is significantly larger in SHP1 owing to a
change in orientation of the C-SH2 domain. This change repositions the
linker between the two SH2 domains removing several key interactions,
highlighted by residue Arg109 in SHP1 and by Arg111 in SHP2 (equivalent
residues). e, Biochemical activity of wild-type SHP2, SHP2 Q257L and
SHP2 T253M/Q257L. SHP2 activity was determined using DiFMUP in the
presence of various concentrations of 2P-IRS-1. Data points along the line
represent the mean of two replicate values. SHP2 Q257L and SHP2 T253M/Q257L
retain activity regulation and 2P-IRS-1 activation potential comparable
to wild-type SHP2 but are 18- and >1,000-fold less sensitive to SHP099
inhibition.
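
The fold changes in SHP099 sensitivity follow directly from the IC50 values tabulated in panel e of Extended Data Fig. 3. The short calculation below is an illustrative check; the variable names are hypothetical, and the double-mutant figure is only a lower bound because its IC50 is reported as >100 µM.

```python
# SHP099 IC50 values (µM) from the table in Extended Data Fig. 3e.
ic50_wt = 0.071
ic50_q257l = 1.33
ic50_double_lower_bound = 100.0  # reported as ">100"

fold_q257l = ic50_q257l / ic50_wt
fold_double_min = ic50_double_lower_bound / ic50_wt

print(f"Q257L: {fold_q257l:.1f}-fold less sensitive")
print(f"T253M/Q257L: more than {fold_double_min:.0f}-fold less sensitive")
```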


[Extended Data Figure 4 image. a, western blots for p-ERK, ERK and p-AKT in KYSE520, MDA-MB-468 and A2058 cells treated with SHP099; b, colony-formation wells for MDA-MB-468, KYSE520 and A2058; c, viability curves plotted against SHP099 (µM); d, scatter plot against lapatinib IC50 (µM), with RAS/RAF-altered lines marked.]

Extended Data Figure 4 | Cellular activity of SHP099. a, Western blot of
SHP2, p-ERK, ERK and p-AKT from KYSE520, MDA-MB-468 or A2058
cells treated with SHP099 (1, 3, 10 µM). Note, three lanes corresponding
to an unrelated study were removed from the image. All lanes originated
from the same gel at the same exposure. b, Colony formation of
KYSE520, MDA-MB-468, and A2058 in the presence of SHP099. Colony
formation was measured after 11 days of SHP099 treatment by crystal
violet staining. c, SHP099 inhibitory activity against cell lines KYSE520
(EGFR-amplified), MV-411 (FLT3-ITD), MOLM-13 (FLT3-ITD), Kasumi-1
(c-Kit altered) and negative control A2058 (BRAF V600E) treated with
SHP099 at concentrations varying from 0.0046 to 20 µM. Cellular viability
was measured using CellTiter-Glo. Data presented as mean ± s.d.
(n = 3). d, Comparison of SHP099 activity with lapatinib in a panel of
26 colorectal cell lines. SHP099 sensitivity correlates with sensitivity to
lapatinib, a potent tyrosine kinase inhibitor against Her1/2 and EGFR.
RAS- and BRAF-mutated cell lines are shown in blue. Cellular viability
was measured using CellTiter-Glo. The corresponding data and cell line
genotypes are included in Supplementary Information Table 2.


[Extended Data Figure 5 image. a, b, western blot and p-ERK bar chart for control-shRNA and SHP2-shRNA xenografts, −Dox and +Dox; c, tumour volume against days post implantation for vehicle and SHP099 dose groups (including 10 mg/kg qd, 30 mg/kg bid, 100 mg/kg q2d and 100 mg/kg qd); d, e, body weight change against days post implantation for vehicle, SHP099 and erlotinib; f, spleen weight for vehicle and SHP099.]

Extended Data Figure 5 | SHP2 depletion or inhibition by SHP099
assessed in in vivo xenograft models. a, Western blot of SHP2 and
GAPDH in KYSE520 xenograft lysates following 14 days of dox treatment
(+). b, Response of p-ERK in tumour xenograft lysates following 14 days
of dox treatment. c, Antitumour efficacy of SHP099 administered orally
for 14 consecutive days at the doses and schedules indicated. Data are
plotted as the treatment mean ± s.e.m. (n = 9). d, Body weight of mice
bearing subcutaneous KYSE520 xenografts and administered either
SHP099 (100 mg per kg daily), erlotinib (80 mg per kg daily) or vehicle
for 14 consecutive days. Data are presented as treatment mean ± s.e.m.
(n = 7). e, Body weight of mice bearing orthotopic primary tumour
derived xenografts and administered an oral gavage of SHP099 at 75 mg
per kg daily. Data are presented as treatment mean ± s.e.m. (n = 7). f, The
mouse spleen weight measurement of mice bearing the orthotopic AML
xenograft model following 34 days of once-daily dosing with SHP099 at
75 mg per kg. Data are presented as treatment mean ± s.e.m. (n = 7).

Extended Data Table 1 | Selectivity profiling of SHP099 in phosphatase enzyme panel

Eurofins phosphatase | Species | Native phosphatase sequence present in recombinant protein | IC50 (µM)
CD45(h) | Human | 598-end | >100
DUSP22(h) | Human | Full length | >100
HePTP(h) | Human | 22-end | >100
LMPTP-A(h) | Human | Full length; Q106R | >100
LMPTP-B(h) | Human | Full length | >100
MKP5(h) | Human | 320-end | >100
PP1a(h) | Human | Full length | >100
PP2A(h) | Human | Native enzyme | >100
PP5(h) | Human | Full length | >100
PTPb(h) | Human | 1643-end | >100
PTP-1B(h) | Human | 1-321 | >100
PTP MEG1(h) | Human | 423-end | >100
PTP MEG2(h) | Human | 283-end | >100
PTPN22(h) | Human | 1-312 | >100
RPTPm(h) | Human | 879-1184 | >100
SHP-1(h) | Human | Full length | >100
SHP-2(h) | Human | 230-545 | >100
TCPTP(h) | Human | 1-341 | >100
TMDP(h) | Human | Full length | >100
VHR(h) | Human | Full length | >100
YopH(Yersinia) | Yersinia | Full length; R211A | >100

Assay was performed using PhosphataseProfiler at Eurofins.


Extended Data Table 2 | Selectivity profiling of SHP099 in kinase enzyme panel

Kinase | IC50 (µM) | Kinase | IC50 (µM)
CE ABL1 (64-515) nonphos v2 | >10 | CE MAPK1 | >10
CE ACVR1 (172-499) | >10 | CE MAPK10 | >10
CE AKT1 | >10 | CE MAPK14 | >10
CE ALK (1066-1459) | >10 | CE MAPKAPK2 | >10
CE AURKA | >10 | CE MAPKAPK5 (2-472) | >10
CE BTK | >10 | CE MET (956-1390) | >10
CE CAMK2D | >10 | CE MKNK1 | >10
CE CDK1B | >10 | CE MKNK2 | >10
CE CDK2A | >10 | CE PAK2 | >10
CE CDK4D1 | >10 | CE PDGFRa (551-V561D-1089) | >10
CE CSK | >10 | CE PDPK1 | >10
CE CSNK1G3 (35-362) | >10 | CE PIM2 | >10
CE EGFR (668-1210) | >10 | CE PKN1 | >10
CE EPHB4 (566-987) | >10 | CE PKN2 | >10
CE ERBB4 (673-1308) | >10 | CE PLK1 | >10
CE FGFR1 (407-822) | >10 | CE PRKACA | >10
CE FGFR2 (406-821) | 9.7 | CE PRKCA | >10
CE FGFR3 (411-806) | >10 | CE PRKCQ | >10
CE FGFR3 (411-K650E-806) | >10 | CE RET (658-1072) | >10
CE FGFR4 (388-802) | >10 | CE ROCK2 (6-553) | >10
CE FLT3 (563-D835Y-993) | >10 | CE RPS6KB1 (1-421) | >10
CE GSK3B | 9.9 | CE SRC (1-535) | >10
CE INSR (871-1343) | >10 | CE STK17B | >10
CE IRAK1 (184-712) | >10 | CE SYK (2-635) nonphos | >10
CE IRAK4 (1-460) | >10 | CE WNK1 (2-491) | >10
CE JAK1 (866-1154) | >10 | CE ZAP70 | >10
CE JAK2 (808-1132) | >10 | ADP-FRET PIK3CD | >10
CE KDR (807-1356) | >10 | ADP-FRET PIK3CG | >10
CE KIT (544-976) | >10 | ATP-binding MTOR (1360-2549) | >10
CE LCK (1-508) | >10 | KGlo PIK3C3 | >10
CE LYN (1-512) | >10 | KGlo PIK3CA | >10
CE MAP3K8 (30-404) | >10 | KGlo PIK3CB | >10
CE MAP4K4 | >10 | KGlo PIK4CB | >10


Extended Data Table 3 | Preclinical safety pharmacology off-target activity panel

Assay | Antagonism AC50 (µM) | Agonism AC50 (µM)
5HT2C | >30 | N/A
Ad1 | >30 | N/A
Ad2A | >30 | N/A
Ad3 | >30 | N/A
alpha2B | >30 | N/A
alpha2C | >30 | N/A
beta1 | >30 | N/A
AT1 | >30 | N/A
CCKa | >30 | N/A
D2 | >30 | N/A
D3 | >30 | N/A
ETa | >30 | N/A
GHS | >30 | N/A
H1 | >30 | N/A
H3 | >30 | N/A
MC3 | >30 | N/A
h Motilin | >30 | N/A
M1 | >30 | N/A
M3 | >30 | N/A
op-delta | >30 | N/A
op-mu | >30 | N/A
hr V1a | >30 | N/A
Nic(ns) | >30 | N/A
5HT3 | 6.7 | N/A
AdT | >30 | N/A
DAT | >30 | N/A
NET | >30 | N/A
5HTT | >30 | N/A
COX-1 | >30 | N/A
COX-2 | 14 | N/A
MAO-A | >30 | N/A
h PDE3 | >30 | N/A
5HT1A | >30 | >30
5HT2A | >30 | >30
5HT2B | >30 | >30
Alpha 1A | >30 | >30
Alpha 2A | >30 | >30
Beta2 | >30 | N/A
CB1 | >30 | >30
D1 | >30 | N/A
GABAA | >30 | >30
M2 | >30 | >30
rrAR | >30 | N/A
ERalpha | 12 | >10
GR | >30 | >30
PPARg | >30 | >30
PR-B | >30 | >30
PXR | >30 | >30


Extended Data Table 4 | Crystallographic Data and Refinement Statistics

Parameters | SHP2/SHP099 complex
Space group | P2₁
Unit cell (Å) | a = 46.19, b = 213.79, c = 55.89
Resolution range (Å) | 24.89-1.70 (1.74-1.70)
Total observations | 371305
Unique reflections | 114992
Completeness (%) (a) | 97.8 (87.3)
Multiplicity | 3.2 (2.4)
<I/σ(I)> (a) | 17.5 (2.4)
Rmerge (a, b) | 0.034 (0.373)
Rcryst/Rfree (c) | 0.195/0.221 (0.235/0.257)
Protein atoms | 7768
Heterogen atoms | 76
Solvent molecules |
Average B-factor (Å²) |
R.m.s.d. bond lengths (Å) |
R.m.s.d. bond angle (°) |
Ramachandran plot (%): favoured |
Ramachandran plot (%): allowed |
Ramachandran plot (%): outliers |

(a) Numbers in parentheses are for the highest resolution shell.
(b) $R_{\mathrm{merge}} = \sum_h |I_h - \langle I_h \rangle| / \sum_h I_h$ over all $h$, where $I_h$ is the intensity of reflection $h$.
(c) $R_{\mathrm{cryst}}$ and $R_{\mathrm{free}} = \sum ||F_o| - |F_c|| / \sum |F_o|$, where $F_o$ and $F_c$ are observed and calculated amplitudes, respectively. $R_{\mathrm{free}}$ was calculated using 5% of data excluded from the refinement.
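
The R-factor definition in the table footnote, the sum of ||Fo| − |Fc|| divided by the sum of |Fo|, translates into a few lines of code. This is a minimal sketch with a hypothetical function name and toy amplitudes; in practice Rcryst is evaluated over the working reflections and Rfree over the 5% held out of refinement.

```python
def r_factor(f_obs, f_calc):
    """R = sum(||Fo| - |Fc||) / sum(|Fo|) over paired amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("f_obs and f_calc must be non-empty and equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy amplitudes for illustration only.
toy_r = r_factor([10.0, 20.0], [9.0, 22.0])
print(f"R = {toy_r:.3f}")
```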


LETTER 


doi:10.1038/nature18629 


Inflammasome-activated gasdermin D causes 
pyroptosis by forming membrane pores 


Xing Liu*, Zhibin Zhang*, Jianbin Ruan*, Youdong Pan, Venkat Giri Magupalli, Hao Wu & Judy Lieberman


Inflammatory caspases (caspases 1, 4, 5 and 11) are activated
in response to microbial infection and danger signals. When
activated, they cleave mouse and human gasdermin D (GSDMD)
after Asp276 and Asp275, respectively, to generate an N-terminal
cleavage product (GSDMD-NT) that triggers inflammatory
death (pyroptosis) and release of inflammatory cytokines such
as interleukin-1β (refs 1, 2). Cleavage removes the C-terminal fragment
(GSDMD-CT), which is thought to fold back on GSDMD-NT to
inhibit its activation. However, how GSDMD-NT causes cell death
is unknown. Here we show that GSDMD-NT oligomerizes in
membranes to form pores that are visible by electron microscopy.
GSDMD-NT binds to phosphatidylinositol phosphates and
phosphatidylserine (restricted to the cell membrane inner leaflet)
and cardiolipin (present in the inner and outer leaflets of bacterial
membranes). Mutation of four evolutionarily conserved basic
residues blocks GSDMD-NT oligomerization, membrane binding,
pore formation and pyroptosis. Because of its lipid-binding
preferences, GSDMD-NT kills from within the cell, but does not
harm neighbouring mammalian cells when it is released during
pyroptosis. GSDMD-NT also kills cell-free bacteria in vitro and
may have a direct bactericidal effect within the cytosol of host cells,
but the importance of direct bacterial killing in controlling in vivo
infection remains to be determined.

We hypothesized that GSDMD-NT might form pores that permeabilize
mammalian membranes during pyroptosis. To examine
whether GSDMD-NT oligomerizes, we expressed Flag-tagged mouse
GSDMD-NT or GSDMD in HEK293T cells and analysed the lysates
by SDS-PAGE and Flag-immunoblot (Extended Data Fig. 1a, b).
Under non-reducing conditions, GSDMD-NT migrated as both an
~30 kDa monomer and a >250 kDa multimer. The multimeric band
disappeared under reducing conditions, suggesting that GSDMD-NT
oligomerization requires disulfide cross-linking. Flag-GSDMD
migrated mostly as a monomer, but a dimeric band was also formed
when reactive sulfhydryl groups were not blocked, suggesting that these
dimers formed during lysis. When the same cell lysates were analysed
by native gel electrophoresis, high molecular weight oligomers were
visualized selectively in cells transfected with Flag-GSDMD-NT
(Fig. 1a). To confirm the association of multiple GSDMD-NT subunits
in the oligomer, we transfected HEK293T cells with Flag- and
haemagglutinin (HA)-tagged GSDMD-NT. Immunoprecipitation with
either anti-Flag (Fig. 1b) or anti-HA (Extended Data Fig. 1c) antibodies
pulled down both tagged species, confirming that GSDMD-NT
self-associates and might form homo-oligomers. When the
co-immunoprecipitation was repeated in cells transfected with Flag-GSDMD-NT,
Flag-GSDMD-CT, and/or GSDMD-CT-MYC, the two species of
GSDMD-CT did not co-precipitate, but Flag-GSDMD-NT associated
with MYC-tagged GSDMD-CT (Extended Data Fig. 1d).

Ectopic caspase-11 expression triggers pyroptosis in GSDMD-expressing
cells (ref. 3). To determine whether caspase-11 activates GSDMD
cleavage and oligomerization, we co-transfected HEK293T cells, which
do not express GSDMD, with plasmids encoding Flag-GSDMD and
wild-type or enzymatically dead (C254A) caspase-11. We analysed
cell death by measuring lactate dehydrogenase release and GSDMD
oligomerization using SDS-PAGE and immunoblot, probed for Flag
and caspase-11 (Fig. 1c). 60% of Flag-GSDMD-expressing cells
co-expressing wild-type, but not mutant, caspase-11 were killed (Extended
Data Fig. 1e). Only wild-type caspase-11 generated GSDMD-NT and
its oligomer. Similar results were obtained when immortalized mouse
bone-marrow-derived macrophages (iBMDMs) stably expressing
Flag-GSDMD were electroporated with lipopolysaccharide (LPS) to activate
caspase-11 (ref. 4; Fig. 1d, Extended Data Fig. 1f). Thus caspase-11
cleaves GSDMD, causing GSDMD-NT oligomerization and pyroptosis.

We hypothesized that GSDMD-NT oligomers form cell membrane
pores that kill cells. Pore-forming proteins often use positively charged
amphipathic structures for membrane insertion (refs 5-7). To identify potential
functional pore domains, we searched for evolutionarily conserved,
positively charged residues in GSDMD-NT, comparing the sequences of
six mammalian species using the Clustal Omega and SOPMA secondary
structure prediction server (ref. 8). A cluster of four such residues occurs
in a pair of predicted amphipathic α-helices (mouse Arg138, Lys146,
Arg152, Arg154) (Fig. 1e, upper panel). Because of their possible
importance, we engineered mutant forms of mouse Flag-GSDMD-NT
containing 2, 3 or 4 Arg or Lys to Ala mutations of these residues.
These changes were not predicted to affect the secondary structure
(Fig. 1e, lower panel), which was verified by showing that the melting
temperatures of wild-type and 4 Ala (4A)-mutated GSDMD-NT
were similar (46.8 °C and 45.6 °C, respectively). We also generated a
mutant protein in which Arg138 was mutated to Ser. We determined
whether these mutations interfere with oligomerization and pyroptosis
in HEK293T cells ectopically expressing GSDMD, wild-type or mutated
GSDMD-NT (Fig. 1f, g). As expected, wild-type GSDMD-NT, but not
GSDMD, triggered both pyroptosis and GSDMD-NT oligomerization.
Mutation of all four basic residues completely blocked both pyroptosis
and oligomerization, whereas mutations of two or three of the residues
resulted in partial blocking. Ectopic Flag- and HA-tagged GSDMD-NT
4A also did not co-immunoprecipitate (Extended Data Fig. 2a). Ala
mutations of other conserved basic residues (Lys204, Lys205, Lys237,
Arg239), alone or combined with mutations in nonconserved basic
residues (Arg248, Lys249) that were not within predicted amphipathic
structures, did not affect pyroptosis (Extended Data Fig. 2b, data not
shown). Oligomerization and cell death were correlated, suggesting that
GSDMD-NT oligomers were responsible for pyroptosis.
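
The search for evolutionarily conserved, positively charged residues can be pictured as a column scan over a multiple sequence alignment. The sketch below is only an illustration of that idea, not the Clustal Omega or SOPMA procedure used in the study: it assumes the sequences are already aligned to equal length, and the function name and toy sequences are hypothetical.

```python
def conserved_basic_columns(aligned_sequences, basic_residues="KR"):
    """Return 0-based alignment columns where every sequence has Lys or Arg."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [
        column
        for column in range(length)
        if all(seq[column] in basic_residues for seq in aligned_sequences)
    ]


# Toy alignment for illustration; these are not GSDMD sequences.
toy_alignment = ["AKRLE", "GRRLD", "AKKME"]
basic_columns = conserved_basic_columns(toy_alignment)
print(basic_columns)
```

Treating Lys and Arg as interchangeable reflects the emphasis on conserved charge rather than residue identity; a stricter scan would require the same residue in every sequence.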

To verify that the 4A mutation inactivates pyroptosis, we knocked 
down Gsdmd in iBMDMs and assessed whether wild-type or 
4A-mutant GSDMD restored LPS-transfection-induced pyroptosis 
(Fig. 1h, Extended Data Fig. 2c, d). Gsdmd knockdown strongly inhib- 
ited pyroptosis, which was restored by ectopic expression of small inter- 
fering RNA (siRNA)-resistant wild-type, but not 4A mutant, Gsdmd. 


1Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Boston, Massachusetts 02115, USA. 2Department of Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, USA. 3Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA. 4Department of Dermatology and Harvard Skin Disease Research Center, Brigham and Women’s Hospital, Boston, Massachusetts 02115, USA.
*These authors contributed equally to this work. 


7 JULY 2016 | VOL 535 | NATURE | 153 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 1, panels a-g (image not reproduced). Recoverable labels: Flag and HA immunoblots (IB) and Flag immunoprecipitation (IP) with input, showing GSDMD, GSDMD-NT, monomer and oligomer bands on non-reducing and reducing gels, with wild-type or C254A caspase-11; sequence alignment of mouse, human, rat, horse, cow and goat GSDMD with predicted secondary structure; y axis, Cell death (%).]


Figure 1 | GSDMD-NT forms oligomers, disrupted by mutation of 
four positively charged residues. a, HEK293T cells, transfected with 
indicated plasmids, lysed under non-reducing conditions, were resolved 
on a native gel, immunoblotted for Flag. b, Flag immunoprecipitation of 
lysates of HEK293T cells, transfected with HA-GSDMD-NT and/or Flag- 
GSDMD-NT, were analysed by immunoblot. c, HEK293T cells, transiently 
transfected with indicated plasmids, were assessed 16h after transfection 
for oligomer formation by Flag immunoblot of non-reducing (left) or 
reducing (right) gels. The reducing gel was also blotted for caspase-11. 

d, iBMDMs expressing Flag-GSDMD were electroporated with 
phosphate-buffered solution (PBS), LPS or Pam3CSK4 and analysed 2h 
later by Flag immunoblot. e, The cluster of four evolutionarily conserved, 
positively charged amino acids (red and underlined) in GSDMD-NT were
mutated to Ala. Secondary structures of the wild-type and 4A-mutated
mouse GSDMD fragment were predicted using the SOPMA algorithm.
h, helix; e, sheet; t, turn; c, coil. f, g, Mutations of mouse GSDMD-NT


Because GSDMD-NT oligomerization was inhibited by reducing
agents, we also mutated the six Cys residues in the mouse protein and 
analysed oligomerization in transfected cells. Mutations of Cys39 or 
Cys192 impaired oligomerization, suggesting that intramolecular 
or intermolecular disulfide bonds between these residues are critical 
for oligomerization (Extended Data Fig. 2e). 

If GSDMD-NT forms plasma membrane pores, it should relocalize
to the cell membrane after caspase activation. To assess membrane 
localization, we lysed cells co-transfected with wild-type or 4A Flag- 
GSDMD and wild-type or C254A caspase-11 in the detergent Triton 
X-114 to separate cytosolic proteins in the aqueous phase from membrane-associated
proteins in the detergent phase (ref. 9) (Fig. 2a, b). A cleavage
fragment that migrated in the same way as Flag-GSDMD-NT was only 
produced in cells transfected with wild-type caspase-11. Wild-type and 
4A Flag-GSDMD-NT were detected in the aqueous phase, but only 
wild-type Flag-GSDMD-NT partitioned into the detergent phase and 
associated with cell membranes. 

To determine with which membrane GSDMD-NT associates, we
fractionated the post-nuclear supernatant of HEK293T cells, transfected
with Flag-GSDMD, or wild-type or 4A GSDMD-NT expression
plasmids, into cytosolic (S100), heavy membrane (P7), light
membrane (P20) and insoluble cytoplasmic fractions (P100) (Fig. 2c).
Flag-GSDMD and 4A GSDMD-NT were solely in the S100 fraction,


[Figure 1, panels e-h continued (image not reproduced). Recoverable labels: 4A-mutated sequence with predicted secondary structure; panel h, y axis, Cell viability (%); conditions, control siRNA, Gsdmd siRNA, empty vector, GSDMD WT, GSDMD 4A, Unt. or LPS electroporation; non-reducing and reducing gels.]


block GSDMD-NT-mediated pyroptosis (f) and oligomerization (g). 
The first to fourth amino acids (R138/K146/R152/R154) were mutated to 
Ala. The mutated residues are indicated, i.e. in NT4A, all 4 residues are 
mutated; in NT 3, 4A, the 3rd and 4th residues are mutated. In NT1S3A, 
R138 was mutated to Ser, the other three residues were mutated to Ala. 
FL, full-length GSDMD; NT, GSDMD-NT. HEK293T cells, transfected 
with wild-type (WT) or mutated GSDMD-NT, were analysed 18h later 
for cell death (CytoTox96 assay) and GSDMD oligomerization (Flag 
immunoblot). GSDMD-NT monomer and oligomer are indicated.

h, iBMDMs, co-transfected with control or Gsdmd siRNA and the indicated 
siRNA-resistant Gsdmd expression plasmids, were electroporated with 
LPS. The number of surviving cells was determined by CellTiter-Glo 
assay 2.5h later. Mean ± s.d. of three technical replicates from one of
three independent experiments are shown (f, h). Statistical differences 
are relative to Flag-GSDMD-NT-expressing samples (f). **P < 0.01 
(two-tailed t-test). NS, not significant; unt., not transfected with LPS. 


but Flag-GSDMD-NT fractionated with both the S100 and heavy
membrane P7 fraction, which contains plasma-membrane fragments 
and mitochondria. When HEK293T cells, transfected to express Flag- 
GSDMD-NT, were separated into soluble and membrane fractions 
and analysed by immunoblot, the cytosolic fraction contained mostly 
monomeric Flag-GSDMD-NT, whereas the membrane fraction only 
contained the high molecular weight oligomer (Fig. 2d). We used 
confocal immunofluorescence microscopy to visualize the cellular 
distribution of transiently expressed Flag-tagged GSDMD and 
wild-type and 4A-mutant GSDMD-NT (Fig. 2e, f). Flag-GSDMD and 
oligomerization-defective 4A Flag-GSDMD-NT stained the cytosol dif- 
fusely, but Flag-GSDMD-NT concentrated on the plasma membrane. 
Thus, GSDMD-NT oligomerizes in the plasma membrane during
pyroptosis. 

Lipid binding influences which membranes pore-forming 
proteins permeabilize. To identify which lipids GSDMD-NT binds, we
incubated recombinant GSDMD, GSDMD-NT, GSDMD-CT and 4A
GSDMD-NT and the cytotoxic lymphocyte pore-forming proteins,
perforin and granulysin, with membranes dotted with different lipids
(Fig. 3a, b). Perforin permeabilizes mammalian membranes, whereas
granulysin preferentially permeabilizes microbial membranes (refs 10, 11).
Consistent with our previous results, GSDMD, GSDMD-CT and 4A
GSDMD-NT did not bind to any lipid. GSDMD-NT bound most


[Figure 2 (image not reproduced). Recoverable labels: panels a, b, aqueous and detergent fractions of cells expressing Flag-GSDMD or Flag-GSDMD 4A with wild-type or C254A caspase-11, showing GSDMD, GSDMD-NT and Na+/K+-ATPase α1 bands; panels c, d, fractions of cells expressing Flag-GSDMD, Flag-GSDMD-NT or Flag-GSDMD-NT 4A, showing oligomer, GSDMD, GSDMD-NT, Na+/K+-ATPase α1 and tubulin bands on reducing and non-reducing gels; panels e, f, Flag-GSDMD, Flag-GSDMD-NT and Flag-GSDMD-NT 4A.]


Figure 2 | GSDMD-NT localizes to the plasma membrane. a, b, Lysates 
of HEK293T cells, transfected with indicated plasmids for 16h, were 
separated into aqueous and detergent phases using Triton X-114, and 
analysed by immunoblot probed for Flag, tubulin, or Na+/K+-ATPase α1.
c, HEK293T cells, transfected with indicated plasmids for 16h, were 
separated into P7 (heavy membrane), P20 (light membrane), P100 
(insoluble cytosol) and S100 (soluble cytosol) fractions and analysed by
immunoblot with indicated antibodies. d, Soluble and crude membrane 
fractions of HEK293T cells, transfected to express Flag-GSDMD-NT, 
were analysed by immunoblot as indicated. e, f, Representative confocal 
microscopy images (e) and quantification (f) of distribution of ectopic 
Flag-GSDMD, Flag-GSDMD-NT and Flag-GSDMD-NT 4A (green) 

in HeLa cells co-stained with DAPI (blue). The ratio of cells with 
membrane versus cytosolic Flag staining was calculated by counting 10 
high-power fields for each sample in 5 independent experiments (f). 

**P < 0.0001 (paired t-test). Scale bar, 20 µm. Data are representative of
three independent experiments (a-d). 


strongly to the mitochondrial and bacterial lipid, cardiolipin, and to
the phosphatidylinositol phosphates (PIPs), PtdIns(4)P and PtdIns(4,5)P2,
and less strongly to phosphatidic acid (PA) and phosphatidylserine
(PS), which are all on the mammalian cell membrane inner leaflet (refs 12, 13).
It did not bind to phosphatidylethanolamine (PE) or phosphatidylcholine
(PC), the major lipids on both plasma membrane leaflets.
Cardiolipin in the mitochondrial inner membrane is inaccessible to
the cytosol (ref. 14). This binding pattern suggests that GSDMD-NT may
selectively bind to the plasma membrane from within and to bacterial
membranes. The outer leaflet of endosome and phagosome membranes
contains the same phospholipids as the plasma membrane inner leaflet,
suggesting that GSDMD-NT may also bind to these organelles. In
comparison, perforin bound to PE, but not PS, and the same PIPs as
GSDMD-NT, consistent with its role in permeabilizing mammalian
cell membranes from the outside; granulysin also bound to cardiolipin,
consistent with its role in microbial immunity. Mixed lineage
kinase domain-like protein, the pore-forming protein activated during
necroptosis, which binds to the inner leaflet of the cell membrane, has
a binding pattern similar to that of GSDMD-NT (refs 15–18).
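
The inference drawn from the lipid-binding pattern can be sketched as a small lookup. The binding scores follow the text (2, strong; 1, weaker; 0, none), but the lipid content assigned to each membrane face is a deliberately simplified assumption for illustration, and all names below are hypothetical.

```python
# Binding of GSDMD-NT as described in the text: 2 = strong, 1 = weaker, 0 = none.
GSDMD_NT_BINDING = {
    "cardiolipin": 2,
    "PtdIns(4)P": 2,
    "PtdIns(4,5)P2": 2,
    "PA": 1,
    "PS": 1,
    "PE": 0,
    "PC": 0,
}

# Simplified lipid exposure per membrane face (an assumption for illustration;
# real membranes contain many more lipid species).
MEMBRANE_FACES = {
    "plasma membrane, inner leaflet": {
        "PtdIns(4)P", "PtdIns(4,5)P2", "PA", "PS", "PE", "PC",
    },
    "plasma membrane, outer leaflet": {"PE", "PC"},
    "bacterial membrane": {"cardiolipin"},
}


def bindable_faces(binding, faces):
    """Return, sorted by name, the faces exposing at least one bound lipid."""
    return sorted(
        face
        for face, lipids in faces.items()
        if any(binding.get(lipid, 0) > 0 for lipid in lipids)
    )


targets = bindable_faces(GSDMD_NT_BINDING, MEMBRANE_FACES)
print(targets)
```

Under these assumptions the outer leaflet, which exposes only PE and PC, is not a predicted target, in line with the suggestion that GSDMD-NT acts on the plasma membrane from within.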

To confirm lipid binding by GSDMD-NT, we measured wild-type
and 4A-mutant GSDMD-NT binding and disruption of PE-PC 




liposomes containing no added lipid or PtdIns(4)P, PtdIns(4,5)P2, PS
or PA (Fig. 3c, d). 4A GSDMD-NT did not bind any of these liposomes,
whereas wild-type GSDMD-NT bound to all of the liposomes containing
the added phospholipids, but not to the PE-PC liposomes. To
measure liposome leakage, PE-PC-PS liposomes were prepared that
encapsulated Tb3+ ions. Alone, Tb3+ is weakly fluorescent, but fluoresces
strongly when bound to dipicolinic acid (DPA) (ref. 19). Fluorescence
of PE-PC-PS liposomes in DPA-containing solutions sharply increased
after adding wild-type, but not mutant, GSDMD-NT, indicating Tb3+
leakage. Similarly, PS-containing liposomes became leaky after incubation
with caspase-11-treated GSDMD, but not after incubation with
caspase-11 or GSDMD alone. Thus GSDMD-NT binds to liposomes
containing PS or PIPs and disrupts them. The buffer used for these
experiments is Ca2+-free, suggesting that GSDMD-NT oligomerization,
unlike perforin oligomerization, is Ca2+-independent.

We next used negative staining electron microscopy to visualize 
GSDMD-NT oligomers on PS-containing liposomes. Liposomes incu- 
bated with GSDMD and caspase-11 showed ruptured morphology, 
whereas those incubated with only GSDMD did not (Fig. 3e). The ruptured
liposomes were decorated with neck-like structures with ~30 nm
diameters at membrane openings, which may represent side views
of GSDMD-NT pores. To visualize these potential pore-like structures
top-down, we used detergent to extract the reconstituted pores
from liposomes and purified the proteins through a size-exclusion
column before performing negative staining electron microscopy.
Stable ring structures with ~15 nm inner and ~32 nm outer diameters
were observed, but only when both caspase-11 and GSDMD were
added (Fig. 3f). Cleaved interleukin-1β, released from cells undergoing
pyroptosis (ref. 20), has a diameter of ~4.5 nm (ref. 21) and could readily pass
through these pores. 
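
The size argument above amounts to simple arithmetic, sketched here under the assumption that both the pore lumen and the cargo can be treated as rigid circles in cross-section. The diameters are the approximate values quoted in the text; the function and variable names are hypothetical.

```python
import math

PORE_INNER_DIAMETER_NM = 15.0  # approximate inner ring diameter (Fig. 3f)
IL1B_DIAMETER_NM = 4.5         # approximate diameter of cleaved interleukin-1β


def circle_area(diameter_nm):
    """Cross-sectional area (nm^2) of a circle with the given diameter."""
    return math.pi * (diameter_nm / 2.0) ** 2


def fits_through(pore_nm, cargo_nm):
    """True if a rigid circular cargo is narrower than the pore lumen."""
    return cargo_nm < pore_nm


diameter_ratio = PORE_INNER_DIAMETER_NM / IL1B_DIAMETER_NM
area_ratio = circle_area(PORE_INNER_DIAMETER_NM) / circle_area(IL1B_DIAMETER_NM)

print(f"pore/cargo diameter ratio: {diameter_ratio:.1f}")
print(f"pore/cargo cross-section ratio: {area_ratio:.1f}")
print(f"cargo fits: {fits_through(PORE_INNER_DIAMETER_NM, IL1B_DIAMETER_NM)}")
```

This ignores protein shape, hydration and charge, so it supports only the qualitative statement that the lumen is several times wider than the cargo.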

Pyroptotic cells release cytosolic contents into the surrounding
media (ref. 22). We used Flag immunoblot to determine whether pyroptotic
HEK293T cells ectopically expressing Flag-GSDMD-NT release
GSDMD-NT into the culture medium (Fig. 4a). Whereas ectopic Flag-GSDMD
was only detected in the cell, Flag-GSDMD-NT was mostly
detected in culture supernatants. To examine the activity of released
GSDMD-NT, we assessed iBMDM viability after incubation with fivefold-concentrated
culture supernatants from HEK293T cells ectopically
expressing Flag-GSDMD-NT or Flag-GSDMD (Extended Data Fig. 3a).
Neither supernatant killed iBMDMs. These results were confirmed by
examining propidium iodide uptake of CFSE-labelled untransfected
HEK293T cells after incubation with Flag-GSDMD-NT-expressing
HEK293T cells (Extended Data Fig. 3b). Virtually all of the transfected
cells died, but none of the bystander cells, consistent with previous
reports””*. Thus GSDMD-NT does not injure bystander cells; it does
not disrupt the plasma membrane from the outside, which is expected
as it only binds to phospholipids present on the inner leaflet of the
plasma membrane of viable cells (Fig. 3a, b).

As GSDMD-NT also binds to cardiolipin, we investigated whether 
the concentrated pyroptotic cell supernatant kills bacteria (Fig. 4b). 
The pyroptotic supernatant reduced Escherichia coli colonies in a dose- 
dependent manner. As pyroptotic cell supernatants contain many 
antibacterial factors, including lysosomal enzymes and lysozyme, 
we assessed the anti-bacterial activity of culture supernatants that 
were immunodepleted of Flag-GSDMD-NT (Fig. 4c). Depletion of 
GSDMD-NT inhibited bacterial killing, supporting a direct antibacterial
effect of GSDMD-NT. These culture supernatants were concentrated
from cells overexpressing GSDMD-NT, an unphysiological
condition that might have unnaturally exaggerated bacterial killing.
To examine whether enough endogenous GSDMD-NT is released to
kill extracellular bacteria, we collected antibiotic-free culture supernatants
from iBMDMs, transfected with LPS or control Pam3CSK4
or treated with LPS and nigericin for 3h, and added them at dilutions
of 1:4 or 1:2 to Listeria monocytogenes. Addition of unconcentrated
pyroptotic iBMDM culture supernatants significantly reduced bacterial
colony-forming units (c.f.u.) in a dose-dependent manner (Fig. 4d).


[Figure 3, panels a-f (image not reproduced). Recoverable labels: panel a, lipid strip layout (left spot / right spot): triglyceride / phosphatidylinositol; diacylglycerol / PtdIns(4)P; phosphatidic acid / PtdIns(4,5)P2; phosphatidylserine / PtdIns(3,4,5)P3; phosphatidylethanolamine / cholesterol; phosphatidylcholine / sphingomyelin; phosphatidylglycerol / sulfatide; cardiolipin / blue blank. Panel c, liposomes with +5% PtdIns(4)P, +15% PS, +5% PtdIns(4,5)P2 or +5% PA, incubated with GSDMD-NT or GSDMD-NT 4A.]


Figure 3 | N-terminal gasdermin D binds to phosphatidylserine and
cardiolipin and forms pores on liposomes. a, b, Membranes displaying 
lipids (a) were incubated with indicated proteins and binding was 
assessed by blotting for GSDMD, perforin or granulysin (b). c, Wild-type 
or 4A-mutant GSDMD-NT binding to PC-PE liposomes containing
additional indicated phospholipids (molar proportion of added lipid 
indicated) was analysed by SDS-PAGE and GSDMD immunoblot. 

d, Liposome leakage was monitored by terbium (Tb3+) fluorescence after
incubation with the indicated GSDMD protein ± caspase-11. Detergent
was added after 9 min (dotted line). e, Negative staining electron 


To confirm that GSDMD-NT accounts for the anti-bacterial activity,
we measured E. coli and Staphylococcus aureus c.f.u. after incubation
with nanomolar concentrations of recombinant GSDMD, GSDMD-CT,
wild-type or 4A GSDMD-NT, or granulysin (Fig. 4e). Wild-type 
GSDMD-NT strongly inhibited colony formation of both bacteria, but 
the other GSDMD constructs had no anti-bacterial activity. Moreover,
GSDMD-NT was more active than granulysin. The anti-bacterial 
effect was rapid: after only 5 min, bacterial c.f.u. were reduced ~2-fold 
(Fig. 4f). Bacterial growth measurements after treatment with the 
GSDMD proteins confirmed these results (Extended Data Fig. 3c). 
To determine whether GSDMD-NT is bactericidal, we measured 
propidium iodide uptake by E. coli and L. monocytogenes after treat- 
ment for 20 min with the same GSDMD constructs (Extended Data 
Fig. 3d, data not shown). Wild-type GSDMD-NT killed ~80% of bacteria,
but 4A GSDMD-NT, GSDMD-CT and GSDMD had no effect.
We next used spinning disk fluorescence microscopy to visualize 
whether AlexaFluor-488-labelled GSDMD-CT or GSDMD, treated or 
not with caspase-11, bound to mCherry-expressing L. monocytogenes 
(Extended Data Fig. 3e). Only caspase-11-treated GSDMD bound. 
Thus GSDMD-NT, released from pyroptotic cells, rapidly binds to and 
kills both Gram-negative and Gram-positive bacteria. 
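
Statements such as "reduced ~2-fold" and "killed ~80%" relate treated colony counts to an untreated control. A minimal sketch of that conversion follows; the function names and the colony counts are hypothetical and are not data from this study.

```python
def percent_survival(treated_cfu, control_cfu):
    """Surviving c.f.u. as a percentage of the untreated control."""
    if control_cfu <= 0:
        raise ValueError("control c.f.u. must be positive")
    return 100.0 * treated_cfu / control_cfu


def fold_reduction(treated_cfu, control_cfu):
    """How many times fewer c.f.u. remain after treatment."""
    if treated_cfu <= 0:
        raise ValueError("fold reduction is undefined when no colonies survive")
    return control_cfu / treated_cfu


# Hypothetical colony counts for illustration only.
control_count = 200
treated_count = 100

survival = percent_survival(treated_count, control_count)
reduction = fold_reduction(treated_count, control_count)
print(f"survival: {survival:.0f}% of control; reduction: {reduction:.1f}-fold")
```

On this scale, 80% killing corresponds to 20% survival, that is, a fivefold reduction in c.f.u.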

Intracellular bacteria trigger pyroptosis when LPS on cytosolic 
Gram-negative bacteria activates the noncanonical inflammasome or 
when invasive Gram-negative or -positive bacteria activate the canonical
inflammasome****®. We first looked at whether ectopic GSDMD
or wild-type or 4A-mutant GSDMD-NT kills intracellular L. monocytogenes
in HeLa cells (Fig. 4g). HeLa cells were infected 6h after transfection.
Although expression of GSDMD or 4A GSDMD-NT had no
effect, wild-type GSDMD-NT significantly reduced recovery of viable
bacteria 6h and 12h later. To assess whether cleavage of endogenous 


[Figure 3d plot (image not reproduced). Series: GSDMD-NT; GSDMD-NT 4A; GSDMD; GSDMD + caspase-11; caspase-11. x axis, Time (s), 100-600.]


microscopy images of PE-PC-PS liposomes treated with GSDMD (left) or 
caspase-11-activated GSDMD (right). Arrows indicate potential side views 
of GSDMD-NT pores. f, Negative staining electron microscopy images

of GSDMD-NT pores formed in PS-containing liposomes and extracted 
by detergent. The left image of pores formed by GSDMD and caspase-11 
shows a field with multiple rings, whereas the right images show enlarged 
single rings. The inner and outer diameters (red dotted lines) are 
approximately 15 nm and 32 nm, respectively. Data are representative of 
three independent experiments. Scale bars, 20 nm.


GSDMD induces intracellular bacterial killing, we examined the effect 
of inflammasome activation on the survival of intracellular L. mono- 
cytogenes in iBMDMs. When LPS-primed-iBMDMs infected with 
L. monocytogenes were treated with nigericin for 1 h to activate the 
canonical inflammasome, bacterial c.f.u. were reduced ~2-fold (Fig. 4h). 
Infection of iBMDMs with L. monocytogenes also independently triggers
AIM2/ASC/caspase-1-mediated pyroptosis”*. To assess the importance
of GSDMD-NT bacterial killing by direct listerial inflammasome
activation, we examined the effect of Gsdmd knockdown on bacterial
c.f.u. (Fig. 4i). iBMDMs with Gsdmd knocked down contained threefold
more bacteria, indicating that inflammasome activation in infected
cells causes GSDMD-dependent death of intracellular bacteria. The
intracellular infection experiments (Fig. 4g–i) were performed without
antibiotics, but similar results were obtained when gentamicin was
used to kill extracellular bacteria and removed before triggering pyroptosis
(data not shown). Thus, inflammasome activation of GSDMD
kills both intracellular and extracellular bacteria in vitro. However, 
viable bacteria were not completely eliminated from these cultures. 
GSDMD-NT could reduce intracellular bacteria by causing host cell 
death, expelling bacteria from the intracellular niche that is favourable 
for their survival and replication (ref. 29), or by a direct anti-bacterial effect.
We have no experimental method to dissociate eukaryotic cell death 
from bacterial cell death at this time. 

How inflammatory caspase cleavage of GSDMD causes pyroptosis (refs 1, 2)
was previously unknown. Here we show that GSDMD-NT binds to
membranes containing PS, cardiolipin, or PIPs to form oligomeric 
pores that kill mammalian cells and the bacteria that trigger pyrop- 
tosis. GSDMD-NT is released into the extracellular milieu during 
pyroptosis. Because GSDMD-NT binds selectively to phospholipids 
that are restricted to the inner leaflet of mammalian cell membranes, 


[Figure 4, panels a-i (image not reproduced). Recoverable labels: panel a, Flag immunoblot of Flag-GSDMD-NT and Flag-GSDMD; panels b-d, x axis, Medium (µl), with anti-Flag depletion (c) and LPS, Pam3CSK4 or LPS + nigericin treatment (d); panel e, E. coli and S. aureus with GSDMD-NT, GSDMD-NT 4A, GSDMD-CT, GSDMD or granulysin (nM); panel f, Time (min), 5, 10, 20, with or without GSDMD-NT; panels g-i, nigericin, control or Gsdmd siRNA, GSDMD and tubulin immunoblot.]


Figure 4 | GSDMD-NT kills bacteria. a, Culture supernatants and 
whole-cell lysates (WCL) of HEK293T cells, transiently expressing Flag-GSDMD-NT
or Flag-GSDMD for 20h, were analysed by Flag immunoblot
of reducing gel. b, Antibiotic-free culture supernatants (concentrated 
fivefold) from transfected HEK293T cells, collected 30h after transfection, 
were added to E. coli, which were cultured at 37°C in 200 µl final volume
for 30 min before measuring c.f.u. c, The concentrated culture medium 
from Flag-GSDMD-NT-expressing HEK293T cells was immunodepleted 
with anti-Flag or control IgG, before adding to E. coli, as in b. Lower panel, 
Flag immunoblot. d, L. monocytogenes were incubated at 37 °C for 30 min 
with antibiotic-free culture supernatants from iBMDMs, transfected with 
LPS or Pam3CSK4 or incubated with LPS and nigericin for 3h, before 
assessing c.f.u. e, f, E. coli and S. aureus were treated with the indicated 


GSDMD-NT does not kill bystander cells. This selective activity should 
control tissue damage. GSDMD-NT killing of intracellular bacteria
should limit the release of viable bacteria from pyroptotic cells and
reduce the spread of infection. We do not know whether GSDMD-NT
is active only against bacteria that have escaped from the phagosome.
Because the phagosome outer leaflet derives from the inner leaflet of
the plasma membrane, phagosomes could be targeted by GSDMD-NT,
providing a mechanism for lysis of bacteria within phagosomes. Released
GSDMD-NT is active on extracellular bacteria, which probably also
helps to control infection. In vivo experiments to show that GSDMD-mediated
bacterial pore formation protects against bacterial infection
would be needed to determine whether direct bacterial killing is physiologically
important. However, we do not have a way to distinguish
in vivo direct killing of bacteria from killing of the host cell (pyroptosis),
as the mechanisms that disrupt one also disrupt the other.

A better understanding of how GSDMD-NT forms pores and a more
complete description of the GSDMD-NT pore could be obtained by
solving the structures of monomeric and oligomerized GSDMD-NT.
The oligomers formed in cells overexpressing GSDMD-NT on native 
gels (Fig. 1a) appeared to be heterogeneous in size, whereas the purified 
reconstituted pores (Fig. 3f) appeared homogeneous. Direct visuali- 
zation of the pores formed on cellular membranes should determine 


proteins for 20 min (e) or with 200 nM wild-type GSDMD-NT for the 
indicated times (f) before measuring c.f.u. g, HeLa cells, transfected for 

6h to express the indicated proteins, were infected with L. monocytogenes 
for the indicated times before cells were lysed to analyse intracellular c.f.u. 
h, LPS-primed-iBMDMs infected with L. monocytogenes, were treated 

or not with nigericin for 1h before bacteria were collected and c.f.u. was 
analysed. Nigericin had no effect on cell-free bacteria (not shown). 

i, iBMDMs, transfected with control or Gsdmd siRNA, were infected with 
L. monocytogenes and assessed for intracellular c.f.u. 12h later. GSDMD 
immunoblot, left. Shown are mean ± s.d. of triplicates of one experiment
of three (b, d-f, h) or two (c, g, i) independent experiments. Statistical 
differences compared to untreated control samples (two-tailed t-test); 
*P<0.05, **P< 0.01. 


whether the pores are uniform. Our identification of mutations that 
inactivate pore formation, but probably do not affect its overall structure,
should help to assess the importance of GSDMD-NT pores in
controlling in vivo infection. 

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 14 March; accepted 9 June 2016. 


1. Shi, J. et al. Cleavage of GSDMD by inflammatory caspases determines
pyroptotic cell death. Nature 526, 660-665 (2015).

2. Kayagaki, N. et al. Caspase-11 cleaves gasdermin D for non-canonical
inflammasome signalling. Nature 526, 666-671 (2015).

3. Rathinam, V. A. et al. TRIF licenses caspase-11-dependent NLRP3
inflammasome activation by gram-negative bacteria. Cell 150, 606-619
(2012).

4. Hagar, J. A., Powell, D. A., Aachoui, Y., Ernst, R. K. & Miao, E. A. Cytoplasmic LPS
activates caspase-11: implications in TLR4-independent endotoxic shock.
Science 341, 1250-1253 (2013).

5. Law, R. H. et al. The structural basis for membrane binding and pore formation
by lymphocyte perforin. Nature 468, 447-451 (2010).

6. Montal, M. Design of molecular function: channels of communication. Annu.
Rev. Biophys. Biomol. Struct. 24, 31-57 (1995).

7. Rosado, C. J. et al. The MACPF/CDC family of pore-forming toxins. Cell.
Microbiol. 10, 1765-1774 (2008).

8. Geourjon, C. & Deléage, G. SOPMA: significant improvements in protein
secondary structure prediction by consensus prediction from multiple
alignments. Comput. Appl. Biosci. 11, 681-684 (1995).

9. Bordier, C. Phase separation of integral membrane proteins in Triton X-114
solution. J. Biol. Chem. 256, 1604-1607 (1981).

10. Dotiwala, F. et al. Killer lymphocytes use granulysin, perforin and granzymes to
kill intracellular parasites. Nat. Med. 22, 210-216 (2016).

11. Walch, M. et al. Cytotoxic cells kill intracellular bacteria through granulysin-mediated
delivery of granzymes. Cell 157, 1309-1323 (2014).

12. van Meer, G., Voelker, D. R. & Feigenson, G. W. Membrane lipids: where they are
and how they behave. Nat. Rev. Mol. Cell Biol. 9, 112-124 (2008).

13. Leventis, P. A. & Grinstein, S. The distribution and function of phosphatidylserine
in cellular membranes. Annu. Rev. Biophys. 39, 407-427 (2010).

14. Schlame, M. Cardiolipin synthesis for the assembly of bacterial and
mitochondrial membranes. J. Lipid Res. 49, 1607-1620 (2008).

15. Pasparakis, M. & Vandenabeele, P. Necroptosis and its role in inflammation.
Nature 517, 311-320 (2015).

16. Dondelinger, Y. et al. MLKL compromises plasma membrane integrity
by binding to phosphatidylinositol phosphates. Cell Reports 7, 971-981
(2014).

17. Wang, H. et al. Mixed lineage kinase domain-like protein MLKL causes
necrotic membrane disruption upon phosphorylation by RIP3. Mol. Cell 54,
133-146 (2014).

18. Sun, L. et al. Mixed lineage kinase domain-like protein mediates necrosis
signaling downstream of RIP3 kinase. Cell 148, 213-227 (2012).

19. Wilschut, J. & Papahadjopoulos, D. Ca2+-induced fusion of phospholipid
vesicles monitored by mixing of aqueous contents. Nature 281, 690-692
(1979).

20. He, W. T. et al. Gasdermin D is an executor of pyroptosis and required for
interleukin-1β secretion. Cell Res. 25, 1285-1298 (2015).

21. Finzel, B. C. et al. Crystal structure of recombinant human interleukin-1β at
2.0 Å resolution. J. Mol. Biol. 209, 779-791 (1989).

22. Lamkanfi, M. & Dixit, V. M. Mechanisms and functions of inflammasomes.
Cell 157, 1013-1022 (2014).

23. Rühl, S. & Broz, P. Caspase-11 activates a canonical NLRP3 inflammasome by
promoting K+ efflux. Eur. J. Immunol. 45, 2927-2936 (2015).

24. Warren, S. E., Mao, D. P., Rodriguez, A. E., Miao, E. A. & Aderem, A. Multiple
Nod-like receptors activate caspase 1 during Listeria monocytogenes infection.
J. Immunol. 180, 7558-7564 (2008).

25. Cervantes, J., Nagata, T., Uchijima, M., Shibata, K. & Koide, Y. Intracytosolic
Listeria monocytogenes induces cell death through caspase-1 activation in
murine macrophages. Cell. Microbiol. 10, 41-52 (2008).

26. Aachoui, Y. et al. Caspase-11 protects against bacteria that escape the vacuole.
Science 339, 975-978 (2013).

27. Wu, J., Fernandes-Alnemri, T. & Alnemri, E. S. Involvement of the AIM2, NLRC4,
and NLRP3 inflammasomes in caspase-1 activation by Listeria
monocytogenes. J. Clin. Immunol. 30, 693-702 (2010).

28. Sauer, J. D. et al. Listeria monocytogenes triggers AIM2-mediated pyroptosis
upon infrequent bacteriolysis in the macrophage cytosol. Cell Host Microbe 7,
412-419 (2010).

29. Miao, E. A. et al. Caspase-1-induced pyroptosis is an innate immune effector
mechanism against intracellular bacteria. Nat. Immunol. 11, 1136-1142 (2010).


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by US NIH grant RO1AI123265 
(J.L.). 


Author Contributions X.L. conceived the study. X.L., Z.Z., J.R., H.W. and J.L. 
designed the experiments and analysed the data. Experiments were performed 
as follows (X.L., Figs 1a–d, h, 2, 3a–d, 4a, c–f, h, Extended Data Fig. 1a–c, e, f; Z.Z.
Figs 1e–g, 3b, d, Fig. 4b, e–h, Extended Data Fig. 1d; J.R. Figs 3, 4e, f; Y.P. Fig. 2e, f;
V.M. Fig. 3e). X.L., H.W. and J.L. wrote the manuscript.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 
H.W. (wu@crystal.harvard.edu) or J.L. (judy.lieberman@childrens.harvard.edu). 


Reviewer Information Nature thanks F. Sigworth and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 


© 2016 Macmillan Publishers Limited. All rights reserved 


METHODS 


Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized and the investigators were not blinded to 
outcome assessment. 

Cell lines and bacterial strains. HEK293T and HeLa cells were obtained from 
ATCC, and C57BL/6 mouse iBMDM cells were provided by J. Kagan (Boston 
Children’s Hospital). Cells were cultured in DMEM (Invitrogen) with 10% 
heat-inactivated foetal bovine serum, supplemented with 100 U ml⁻¹ penicillin G, 
100 µg ml⁻¹ streptomycin sulphate, 6 mM HEPES, 1.6 mM L-glutamine, and 50 µM 
2-mercaptoethanol (2ME). There were no antibiotics in the cell culture medium 
used for bacterial infection and for experiments in which culture supernatants were 
collected for bacterial incubation. Cells were verified to be free of mycoplasma 
contamination. Transient transfection of HEK293T and HeLa cells was performed 
using the calcium phosphate method or Lipofectamine 2000 (Invitrogen) according 
to the manufacturer’s instructions. iBMDM cells were transfected by nucleofection 
(Amaxa) using the Amaxa Nucleofector kit (VPA-1009). Bacterial strains were 
obtained from ATCC (E. coli strain BL21, S. aureus strain CA-MRSA USA300 and 
L. monocytogenes 10403S strain) and grown in Luria broth (LB), tryptic soy broth 
and brain-heart infusion media, respectively. 

Reagents. Polyclonal anti-human GSDMD was from Novus Biologicals (NBP2- 
33422) or Proteintech (20770-1-AP). Monoclonal anti-haemagglutinin (F-7) 
antibody (sc-7392) was from Santa Cruz Biotechnology. Monoclonal anti-Flag 
M2 antibody (F1804), monoclonal anti-human α-tubulin antibody (T5168) and 
monoclonal anti-mouse caspase-11 antibody (C1354) were from Sigma-Aldrich. 
Polyclonal anti-human Na⁺/K⁺-ATPase α1 (ATP1A1) antibody (#3010) was 
from Cell Signaling. c-MYC (9E10) monoclonal antibody (MMS-15P) was from 
Covance. Monoclonal anti-human perforin antibody (3465-6-250) and polyclonal 
anti-human granulysin antibody (AF3138) were from Mabtech and Novus, respec- 
tively. N-ethylmaleimide, 2ME, DTT, terbium(III) chloride, DPA (dipicolinic acid) 
and nigericin were from Sigma-Aldrich. Ultrapure LPS and Pam3CSK4 were from 
InvivoGen. The complete protease inhibitor cocktail was from Roche. siRNA 
duplexes targeting Gsdmd (s87492; 5’-GGUGAACAUCGGAAAGAUUTT-3’) 
and the nonspecific control siRNA (CTL, 4390843) were from Ambion. 

Plasmids. pCMV6-Gsdmd and pCMV-Flag-caspase-11 constructs were obtained 
from Origene and Addgene, respectively. cDNA for Gsdmd was subcloned into 
pFlag-CMV4, pcDNA3-N-HA and pcDNA3-C-5xMyc. GSDMD truncation 
mutants were derived by PCR from the corresponding plasmids. All point muta- 
tions were generated using QuikChange XL site-directed mutagenesis (Stratagene). 
All plasmids were verified by sequencing. 

Protein expression and purification. Full-length human GSDMD was cloned into 
the pDB.His.MBP vector with a tobacco etch virus (TEV)-cleavable N-terminal 
His₆-MBP tag using NdeI and XhoI restriction sites. 4A GSDMD, GSDMD-CT, 
and wild-type and 4A-mutant GSDMD were constructed by QuikChange 
Mutagenesis (Agilent Technologies). For expression and purification of full-length 
GSDMD, GSDMD-CT and GSDMD-NT 4A mutant, E. coli BL21 (DE3) cells harbouring 
the indicated plasmids were grown in LB medium supplemented with 
50 µg ml⁻¹ kanamycin. Protein expression was induced at 18°C overnight by 
0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG) when OD₆₀₀ reached 0.8. 
Cells were collected and resuspended in lysis buffer containing 25 mM Tris-HCl 
(pH 8.0), 150mM NaCl, 20 mM imidazole and 5mM 2ME, and lysates were 
homogenized by ultrasonication. The cell lysate was clarified by centrifugation at 
40,000g at 4°C for 1h. The supernatant containing the target protein was incubated 
with Ni-NTA resin (Qiagen) that was pre-equilibrated with lysis buffer for 30 min 
at 4°C. After incubation, the resin-supernatant mixture was poured into a column 
and the resin was washed with lysis buffer. The protein was eluted using the lysis 
buffer supplemented with 500 mM imidazole. The His₆-MBP-tagged protein was 
further purified by HiTrap Q ion-exchange and Superdex 200 gel-filtration 
chromatography (GE Healthcare Life Sciences). The His₆-MBP tag was removed by 
overnight TEV protease digestion at 16°C. The cleaved protein was purified using 
HiTrap Q ion-exchange and Superdex 200 gel-filtration columns. 

The yield of wild-type GSDMD-NT was lower than for the other constructs 
because it inserted into bacterial membranes and killed >50% of bacteria after 
overnight expression, and thus required a different purification strategy. To purify 
wild-type GSDMD-NT, cells containing the pDB-His₆-MBP-GSDMD-NT clones 
were grown and induced as described for full-length GSDMD. They were collected 
and resuspended in lysis buffer containing 25 mM Tris-HCl (pH 8.0) and 150 mM 
NaCl, and homogenized by ultrasonication. The membrane fraction was harvested 
by ultracentrifugation at 200,000g at 4°C for 1 h and resuspended and solubilized 
with 1.0% n-dodecyl-β-D-maltoside (DDM) in lysis buffer supplemented with 
20mM imidazole using a glass homogenizer, followed by centrifugation at 200,000g 
at 4°C for 45 min. The supernatant containing the solubilized protein was incu- 
bated for 30 min at 4°C with Ni-NTA resin (Qiagen) that was pre-equilibrated with 
the lysis buffer containing 1% DDM and 20 mM imidazole. The resin was washed 
with lysis buffer containing 1% DDM and 20 mM imidazole, and the recombinant 
proteins were eluted with lysis buffer supplemented with 500 mM imidazole 
and 0.1% DDM. The His₆-MBP-GSDMD-NT protein was further purified using 
a Superdex 200 gel-filtration column. For protein used for bacterial growth assay, 
the Ni-NTA resin was washed with at least 40 column volumes of lysis buffer 
without detergent before elution and the protein was further purified with deter- 
gent-free buffers. 

The caspase-11 gene was cloned into the pFastBac-HTa vector with a TEV-cleavable 
N-terminal His₆ tag using EcoRI and XhoI restriction sites. The baculoviruses 
were prepared using the Bac-to-Bac system (Invitrogen), and the protein was 
expressed in Sf9 cells following the manufacturer's instructions. His-caspase-11 
baculovirus (10 ml) was used to infect 1 l of Sf9 cells. Cells were collected 48h 
after infection and His-caspase-11 was purified following the same protocol as for 
His₆-MBP-GSDMD. After elution from Ni-NTA resin, the protein was further 
purified using a Superdex 200 gel-filtration column. Aggregated fractions, which 
were the activated form of caspase-11, were collected for use in subsequent assays. 
For some experiments GSDMD-NT was generated by mixing caspase-11 with 
GSDMD at a 1:3 mass ratio for indicated times at 16°C. 

Native human granulysin and perforin were purified from isolated YT-Indy 
cytotoxic granules as previously described³⁰. 

GSDMD thermal shift assay. Experimental protein unfolding was monitored 
by fluorescence of the Protein Thermal Shift Dye (Thermo Fisher Scientific) as 
temperature was continuously increased at a ramp rate of 1.6°C per min using 
an Applied Biosystems StepOne Real-Time PCR machine. Samples of wild-type 
and 4A-mutant MBP-GSDMD were subdivided into three 20 µl replicates on a 
MicroAmp Optical 96-Well reaction plate. The transition thermal melting 
temperatures (Tm) were extracted using Applied Biosystems StepOne software version 2.3. 
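
The text does not describe how the StepOne software derives Tm. As a rough illustration only, one common approach takes Tm as the temperature at which the melt curve rises most steeply (maximum dF/dT). The sketch below assumes this first-derivative method; the function name `melting_temperature` and the synthetic curve are hypothetical and are not the instrument's algorithm or data from the paper.

```python
import math


def melting_temperature(temps_c, fluorescence):
    """Estimate Tm as the temperature of maximum dF/dT (central differences).

    Assumption: this first-derivative approach is a generic illustration,
    not the algorithm used by the StepOne software.
    """
    best_slope = float("-inf")
    tm = None
    for i in range(1, len(temps_c) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps_c[i + 1] - temps_c[i - 1]
        )
        if slope > best_slope:
            best_slope, tm = slope, temps_c[i]
    return tm


# Synthetic sigmoidal melt curve (hypothetical): midpoint set at 52 degrees C.
temps = [25 + 0.5 * i for i in range(141)]  # 25 to 95 degrees C in 0.5 degree steps
curve = [1.0 / (1.0 + math.exp(-(t - 52.0) / 2.0)) for t in temps]
tm = melting_temperature(temps, curve)
```

Comparing the estimate for wild-type and 4A-mutant protein under the same ramp would give the thermal shift between the two.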
Immunoblot and immunoprecipitation. Cells were lysed in lysis buffer (50 mM 
Tris-Cl (pH 7.4), 150 mM NaCl) supplemented with 1% Triton X-100 and 1 mM PMSF. 
Cell lysates were boiled in SDS loading buffer for 5 min before electrophoresis 
through SDS-PAGE gel. The resolved proteins were then transferred to a polyvi- 
nylidene difluoride (PVDF) membrane (Millipore), which was probed with the 
indicated antibodies. Protein bands were visualized using a SuperSignal West Pico 
chemiluminescence ECL kit (Pierce). For non-reducing gels, cells were lysed in 
lysis buffer with or without 30 mM N-ethylmaleimide and cell lysates were pre- 
pared with 2ME-free SDS loading buffer. For immunoprecipitations, cell extracts 
were prepared using RIPA buffer (50 mM Tris-HCl (pH 7.4), 150 mM NaCl, 1mM 
EDTA, 1% Triton X-100, 0.1% SDS, 0.5% deoxycholate) containing complete pro- 
tease inhibitor cocktail. Lysates were incubated with the relevant antibody for 4h 
at 4°C before adding protein A/G agarose for 2h. Beads were washed three times 
with the same buffer and bound proteins were eluted with SDS loading buffer by 
boiling for 5 min. 

Native gel immunoblot. Cell samples, prepared using NativePAGE Sample Prep 
Kit (Invitrogen), were electrophoresed through a 4–16% NativePAGE Bis-Tris gel 
(Invitrogen) in NativePAGE running buffer (Invitrogen) at 4°C and 150 V. Proteins 
were then transferred to a PVDF membrane at 0.2 A for 1h in NativePAGE transfer 
buffer (Invitrogen) for immunoblotting. 

Cytotoxicity assays. Cell death and cell viability were determined by the lactate 
dehydrogenase release assay using CytoTox 96 Non-Radioactive Cytotoxicity Assay 
kit (Promega) and by measuring ATP levels using the CellTiter-Glo Luminescent 
Cell Viability Assay (Promega), respectively, according to the manufacturer’s 
instructions. 

Triton X-114 phase separation. Cells were lysed in lysis buffer (20 mM HEPES 
(pH 7.4), 150mM NaCl, 2% Triton X-114 (Sigma), complete protease inhibitor) 
and then centrifuged at 15,000g for 15 min. The resultant supernatant mixture was 
incubated at 30°C for 10 min to separate the upper aqueous fraction from the lower 
detergent soluble fraction. The aqueous fraction was spun at 1,500g for 5 min at 
room temperature and the upper fraction harvested to eliminate contamination 
from the detergent-enriched phase. The detergent-enriched phase was diluted with 
lysis buffer lacking Triton X-114 and re-spun at 1,500g for 10 min and the detergent 
phase was recollected. The washed detergent phase was diluted with lysis buffer 
lacking Triton X-114 to the same final volume as the aqueous fraction. 

Cell fractionation. Cells were washed with PBS and collected by scraping in PBS 
on ice. Then cells were washed once in PBS and resuspended in five cell volumes 
of buffer A (20 mM HEPES (pH 7.4), 40 mM KCl, 1.5 mM MgCl₂, 1 mM EDTA, 
1 mM EGTA, 0.1 mM PMSF and 250 mM sucrose, 1× protease inhibitors). Cells 
were then incubated for 30 min on ice in buffer A, and lysed by passage through a 
22-gauge needle 30 times. Lysates were spun at 800g for 10 min to remove unbroken 
cells and nuclei. The post-nuclear supernatant was spun at 7,000g for 10 min, and 
the supernatant (S7) was re-spun at 20,000g for 10 min, whereas the pellet (P7) 
was resuspended in the same volume of buffer A. The resulting pellet (P20) was 
again resuspended in buffer A and the supernatant (S20) was re-spun at 100,000g 
for 1h and the resulting supernatant (S100) was collected and the pellet (P100) was 
resuspended in the same volume of buffer A as before. All the pellets containing 
membrane proteins were washed with buffer A. To separate soluble and crude 
membrane fractions, cells were lysed in buffer A and intact cells, nuclei and cell 
debris were removed by centrifugation of the homogenate at 800g for 10 min at 
4°C and then the supernatant was spun at 100,000g for 1h at 4°C. The supernatant 
containing cytosolic proteins (soluble fraction) was collected. The pellets were 
washed with buffer A and were re-centrifuged at 100,000g for 1h. The precipitate 
was the crude membrane fraction. 

Immunostaining and confocal microscopy. Cells grown on coverslips were 
fixed for 15 min with 4% paraformaldehyde in PBS, permeabilized for 20 min in 
0.1% Triton X-100 in PBS and blocked using 5% BSA for 1h. Then, the cells were 
stained with the indicated primary antibodies, followed by incubation with fluo- 
rescent-conjugated goat anti-mouse IgG (Invitrogen). Nuclei were counterstained 
with DAPI (Cell Signaling). Slides were mounted using Fluorescence Mounting 
Medium (Dako). Images were captured at room temperature using a confocal 
microscope (Olympus Fluoview FV1000 Confocal System) with a 63× water 
immersion objective and Olympus Fluoview software (Olympus). All confocal 
images are representative of three independent experiments. 

Protein-lipid binding assay. Proteins were spotted on Membrane Lipid Strips 
(Echelon Biosciences) according to the manufacturer’s instructions. To block 
non-specific binding, lipid strips were preincubated with binding assay buffer 
(3% fatty acid-free BSA (Sigma) in PBS) for 1h at room temperature. Then the 
strips were incubated with protein (2 µg ml⁻¹) diluted in binding assay buffer 
for 1h at room temperature and then washed three times (6 min each time) with 
wash buffer (0.1% Tween-20 in PBS). Membrane-bound proteins were detected by 
probing the lipid strips with corresponding primary antibodies diluted in binding 
assay buffer for 1h at room temperature, followed by incubation for 1h with horse- 
radish-peroxidase-conjugated secondary antibody diluted 1:2000 in binding assay 
buffer. After washing three times with wash buffer, proteins were visualized using 
a SuperSignal West Pico chemiluminescence ECL kit (Pierce). 

Liposome binding assay. Liposomes were prepared by hydration of lipids (Avanti 
Polar Lipids) in buffer R (20mM HEPES (pH 7.4), 150mM NaCl) followed by 
extrusion through a 100-nm polycarbonate membrane (~24 passages). All lipos- 
omes were composed of a 4:1 molar ratio of distearoyl PC and PE, to which we 
added other phospholipids (PtdIns(4)P, PtdIns(4,5)P2, PS, PA) as indicated. For 
liposome binding assay, protein (0.1 µM) was incubated with liposomes in 100 µl 
of buffer R for 20 min at room temperature before sedimentation at 140,000g 
for 20 min at 4°C. Supernatants were removed immediately and the pellets were 
washed twice with buffer R and then resuspended in an equal volume of buffer. 
Proteins in both pellets and supernatant were then analysed by SDS-PAGE and 
immunoblot. 

Liposome leakage assay. The leakage of liposomes encapsulating TbCl₃ was 
determined by an increase in fluorescence intensity when Tb³⁺ bound to DPA 
in the external buffer. Tb³⁺-entrapped liposomes were prepared by hydration 
of the indicated lipids in buffer R containing 50 mM sodium citrate and 15 mM 
TbCl₃. Liposomes were washed twice to remove unincorporated TbCl₃. Then, 
Tb³⁺-entrapped liposomes were suspended in 100 µl buffer R supplemented with 50 µM 
of DPA and indicated GSDMD recombinant proteins were added. Fluorescence at 
490 nm after excitation at 276 nm was continuously recorded for 9 min at 20 s 
intervals using a Biotek Synergy plate reader. At the end of the incubation, 0.1% Triton 
X-100 was added to the medium to measure complete release of Tb³⁺. The extent 
of liposome leakage was calculated by using the formula 
$R_t\,(\%) = 100 \times (F_t - F_{t0})/(F_{t100} - F_{t0})$, 
where $F_{t0}$ is the initial fluorescence of the Tb³⁺ liposomes in the DPA-containing 
buffer at the time GSDMD recombinant proteins were added, $F_t$ is the 
fluorescence signal recorded at individual time points, and $F_{t100}$ is the mean of the 
top-three fluorescence reads after adding 0.1% Triton X-100. 
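
The leakage formula above can be sketched in a few lines of Python. This is a minimal illustration, assuming the plate-reader trace and the post-Triton reads are available as plain lists; the function name `percent_leakage` and all numerical values are hypothetical and are not data from the paper.

```python
from statistics import mean


def percent_leakage(trace, f_t0, triton_reads):
    """Return leakage (%) for each fluorescence read in `trace`.

    f_t0         -- fluorescence when the recombinant protein was added
    triton_reads -- reads recorded after adding 0.1% Triton X-100;
                    F_t100 is the mean of the three highest reads
    """
    f_t100 = mean(sorted(triton_reads, reverse=True)[:3])
    span = f_t100 - f_t0
    if span <= 0:
        raise ValueError("F_t100 must exceed F_t0")
    return [100.0 * (f_t - f_t0) / span for f_t in trace]


# Hypothetical readings in arbitrary fluorescence units.
trace = [100.0, 250.0, 400.0, 550.0]
triton = [690.0, 700.0, 710.0, 650.0]
leakage = percent_leakage(trace, f_t0=100.0, triton_reads=triton)
```

Normalizing to the detergent-released signal puts every well on the same 0 to 100% scale, so traces from liposome preparations with different Tb³⁺ loading can be compared directly.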

Preparation of unilamellar liposomes for reconstitution of GSDMD-NT pores. 
Synthetic 1,2-dioleoyl-sn-glycero-3-(phospho-L-serine) (DOPS), 1-palmitoyl- 
2-oleoyl-sn-glycero-3-phosphocholine (POPC) and 1,2-dioleoyl-sn-glycero- 
3-phosphoethanolamine (DOPE) (Avanti Polar Lipids) dissolved in chloroform 
were mixed in a glass tube at a mass ratio of 5:5:1, and the solvent was evaporated 
under a stream of N₂ gas. A buffer composed of 25 mM Tris-HCl (pH 8.0) and 
150 mM NaCl was added to yield a final lipid concentration of 5 mM. The lipid 
suspension was then vortexed continuously for 5 min. To obtain unilamellar 
vesicles, liposomes were extruded with 21 passes through a mini-extruder device 
(Avanti) using membranes with 100 nm pores. 

Reconstitution of GSDMD-NT pores on PS-containing liposomes. 
PS-containing liposomes (2 µmol) were incubated with 1 mg full-length GSDMD 
and 0.3 mg caspase-11 for 4h at 16°C. After incubation, the liposome-protein 
suspension was collected by ultracentrifugation at 60,000g for 30 min at 4°C. 
The pellets were washed twice with lysis buffer and then resuspended in 500 µl 
lysis buffer containing 0.5% C12E8 (Anatrace). After centrifugation for 30 min at 
60,000g, the supernatant containing the solubilized GSDMD-NT pores was loaded 
in running buffer containing 0.5% C12E8 onto a Sepharose-6 gel filtration column. 
Fractions were analysed by SDS-PAGE and the fractions containing GSDMD-NT 
were pooled and imaged using negative staining electron microscopy. 

Negative staining electron microscopy of GSDMD-NT pores. Copper grids 
(Electron Microscopy Sciences) coated with a layer of thin carbon were rendered 
hydrophilic immediately before use by glow-discharge in air with 25 mA current 
for 1 min. Liposome-protein suspensions (5 µl), or protein samples extracted from 
liposomes were loaded onto the grids, air dried for ~1 min and blotted, leaving a 
thin layer of sample on the grid surface. The grids were floated on a drop of stain- 
ing solution containing 2% uranyl acetate for 60s. After air drying, the grids were 
examined using a Tecnai G² Spirit BioTWIN electron microscope. 

Bacterial growth assay. Colony-forming unit assays and turbidimetry were used 
to measure bacterial growth as previously described¹¹. Briefly, for turbidimetry, 
bacteria were diluted (1:100) in bacterial culture medium following treatment and 
incubated with discontinuous shaking at 37°C in a 200 µl volume in flat-bottomed 
96-well plates. Growth curves were monitored by reading absorbance at 600 nm 
over 16h using a Spectra MAX 340 (Molecular Devices) or Synergy H4 Hybrid 
Multi-Mode Microplate Reader (BioTek). The time until the growth curves reached 
a threshold OD₆₀₀ of 0.05 above background was defined as the T_threshold. The 
ratio of T_threshold (untreated):T_threshold (treated) was used to quantify the change 
in bacterial growth. 
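
As a rough sketch of the turbidimetry readout described above, the following Python computes the time at which a growth curve first reaches 0.05 OD₆₀₀ units above background, and the untreated:treated ratio. Linear interpolation between successive reads is an assumption made here for illustration; the text does not state how the crossing time was obtained. The function name `time_to_threshold` and the example curves are hypothetical.

```python
def time_to_threshold(times_min, od600, background, delta=0.05):
    """Return the time (min) at which OD600 first reaches background + delta.

    Assumption: the crossing time is linearly interpolated between the two
    reads that bracket the threshold. Returns None if never reached.
    """
    target = background + delta
    for i, od in enumerate(od600):
        if od >= target:
            if i == 0:
                return times_min[0]
            t0, t1 = times_min[i - 1], times_min[i]
            y0, y1 = od600[i - 1], od
            return t0 + (target - y0) * (t1 - t0) / (y1 - y0)
    return None


# Hypothetical growth curves read every 60 min (not data from the paper).
times = [0, 60, 120, 180, 240]
untreated = [0.10, 0.10, 0.20, 0.40, 0.50]
treated = [0.10, 0.10, 0.10, 0.20, 0.40]

t_untreated = time_to_threshold(times, untreated, background=0.10)
t_treated = time_to_threshold(times, treated, background=0.10)
ratio = t_untreated / t_treated  # smaller ratio = longer lag after treatment
```

A treated culture with fewer viable bacteria takes longer to reach the threshold, so the ratio falls below 1.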

LIVE/DEAD assay. Bacterial viability was assessed using the bacterial LIVE/ 
DEAD assay (Invitrogen), following the manufacturer’s recommendations. Briefly, 
bacteria were treated in the presence of 5 µM Syto-9 (Invitrogen) and 15 µM 
propidium iodide (Invitrogen). Treatment with 70% isopropanol served as a 
positive control. Fluorescence was visualized by confocal microscopy. 

Fluorescent protein labelling and protein binding assay. Full-length and 
C-terminal GSDMD were labelled with AlexaFluor-488 using the Molecular 
Probes protein labelling kit. An aliquot of the labelled full-length protein was acti- 
vated by incubating with active caspase-11 for 15 min at 37°C. L. monocytogenes 
expressing mCherry (a gift from J. Theriot, Stanford Medical School) were treated 
with PBS or with 500nM AF488-labelled GSDMD that had been activated or not 
with caspase-11 or with GSDMD-CT for 30 min at 37°C. Bacteria were washed 
with 10 mM arginine in PBS for 10 min before fixation in 2% formalin in PBS. 
Slides were mounted with fluorescence mounting medium (Dako) and imaged 
using a fully motorized Axio Observer spinning disk microscope (Carl Zeiss 
Microimaging, Inc.) equipped with a cooled electron multiplication CCD camera 
with 512 × 512 resolution (Photometrics QuantEM, Tucson, AZ) with excitation 
filters set at 405, 488, 561 and 640 nm and emission filter ranges of 430-475, 500-550, 
589-625 and 680 nm long-pass, respectively. Images were analysed using SlideBook 
V5.0 (Intelligent Imaging Inc.) software. Three-dimensional image stacks were 
obtained along the z axis using the 63× oil immersion objective by acquiring 
sequential optical planes spaced 0.25 µm apart. Raw images were deconvolved 
using SlideBook. 

Treatment of extracellular bacteria. HEK293T cells, cultured in antibiotic-free 
medium, were transfected with the indicated plasmids for 30h before culture 
supernatants were collected and concentrated fivefold. iBMDMs, cultured in 
antibiotic-free medium, were transfected with LPS or Pam3CSK4 or incubated 
with LPS and nigericin for 3h before culture supernatants were collected and 
used without concentration. Exponential phase bacteria were treated with the 
indicated antibiotic-free culture supernatants or recombinant proteins, which were 
cultured at 37°C for the indicated time. Treated bacteria were diluted in LB and 
plated on LB (E. coli, S. aureus) or brain-heart infusion (L. monocytogenes) 
agar plates to determine c.f.u., which were normalized to c.f.u. in control conditions. 
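
The normalization of c.f.u. to control conditions can be illustrated with a short calculation. This is a sketch only: the dilution factor, plated volume and colony counts are hypothetical, since the text does not give them, and the function names are introduced here for illustration.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate c.f.u. per ml of the undiluted sample from a plate count."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def normalized_cfu(treated_cfu, control_cfu):
    """Express treated c.f.u. as a fraction of c.f.u. in the control condition."""
    if control_cfu <= 0:
        raise ValueError("control c.f.u. must be positive")
    return treated_cfu / control_cfu


# Hypothetical plate counts: 10^4-fold dilution, 0.1 ml plated per plate.
treated = cfu_per_ml(colonies=42, dilution_factor=1e4, plated_volume_ml=0.1)
control = cfu_per_ml(colonies=210, dilution_factor=1e4, plated_volume_ml=0.1)
survival = normalized_cfu(treated, control)
```
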
Intracellular bacterial killing assay. HeLa cells or iBMDMs were transfected with 
indicated plasmids or siRNAs. Cells were infected with L. monocytogenes (mul- 
tiplicity of infection, 10:1) 6h after transfection of plasmids or 48h after siRNA 
transfection. Cell plates were centrifuged at 1500 r.p.m. for 10 min, and placed at 
37°C for 30 min before washing to remove extracellular bacteria. Cells were lysed 
using 0.1% Triton X-100 at indicated time points after infection and supernatants 
were collected to determine bacterial titers by c.f.u. assay. For the nigericin 
experiment, iBMDMs were primed for 4h with LPS (100 ng ml⁻¹) and then infected 
with L. monocytogenes as described above. After removing extracellular bacteria 
by washing, cells were treated or not with nigericin (20 µM) and 1h later bacteria 
were collected as described above for c.f.u. assay. 

Statistics. Student's t-test (two-tailed) was used for the statistical analysis of all 
experiments. P values <0.05 were considered significant. 
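
For readers who want to reproduce this kind of comparison, a two-tailed Student's t-test can be sketched with the Python standard library alone. The sketch assumes an unpaired test with pooled variance, which the text does not specify, and obtains the P value by numerically integrating the t density with Simpson's rule. The function name `students_t_test` and the triplicate values are hypothetical.

```python
import math
from statistics import mean, variance


def students_t_test(a, b, steps=2000):
    """Unpaired, pooled-variance Student's t-test; returns (t, df, two-tailed P).

    Assumption: equal variances and unpaired samples. `steps` must be even
    (Simpson's rule is used to integrate the t density from 0 to |t|).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))

    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3.0
    p = max(0.0, 1.0 - 2.0 * area)
    return t, df, p


# Hypothetical triplicate measurements (not data from the paper).
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
t_stat, dof, p_value = students_t_test(control, treated)
significant = p_value < 0.05
```
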


30. Thiery, J., Walch, M., Jensen, D. K., Martinvalet, D. & Lieberman, J. Isolation of 
cytotoxic T cell and NK granules and purification of their effector proteins. 
Curr. Protoc. Cell Biol. 47, 3.37:3.37.1-3.37.29 (2010). 




Extended Data Figure 1 | GSDMD-NT oligomerizes and induces pyroptosis. 
a, b, HEK293T cells, transfected with Flag-GSDMD-NT (a) or Flag-GSDMD (b), 
were lysed with or without N-ethylmaleimide or 2ME, and analysed by SDS-PAGE 
and Flag immunoblot. c, Lysates of HEK293T cells, transfected with HA-GSDMD-NT 
and/or Flag-GSDMD-NT, were immunoprecipitated with anti-HA and analysed by 
immunoblot with the indicated antibodies. d, HEK293T cells were transfected with 
the indicated plasmids. Cell lysates were immunoprecipitated with anti-Flag and 
analysed by immunoblot with the indicated antibodies. Flag-GSDMD-NT (Flag-NT) 
was expressed at considerably lower levels than GSDMD-CT-MYC (CT-MYC) or 
Flag-GSDMD-CT (Flag-CT), which accounts for the relative weak intensity of the 
corresponding bands on the middle blot. e, HEK293T cells, transiently transfected 
with the indicated plasmids, were assessed 16h after transfection for cell death by 
CytoTox96 assay. f, Immortalized iBMDMs expressing Flag-GSDMD were 
electroporated with PBS, ultra LPS or Pam3CSK4, as a negative control for 
pyroptosis. 2h later, cell death was determined by CytoTox96 assay. Graphs show 
the mean ± s.d. of triplicate wells and data shown are representative of three 
independent experiments. **P < 0.01 (two-tailed t-test). 




Extended Data Figure 2 | Mutation of four positively charged residues 
in GSDMD-NT or of two cysteine residues disrupts pyroptosis. 
a, Lysates of HEK293T cells, transfected with the indicated plasmids, 
were immunoprecipitated with anti-Flag and analysed by immunoblot 
with the indicated antibodies. The 4A mutant of GSDMD-NT does not 
self-associate in multimers. b, Mutations in other basic residues do not 
affect pyroptosis. The indicated wild-type or mutated Flag-GSDMD-NT 
constructs were transiently expressed in HEK293T cells. Medium was 
collected 18 h after transfection and cell death was measured by CytoTox96 
assay. c, d, Knockdown in immortalized iBMDMs of Gsdmd and ectopic 
expression of wild-type or 4A Gsdmd mRNA (c, assessed by RT-PCR 
relative to GAPDH) and protein (d, relative to tubulin). These data for 
the cells used in the rescue experiment in Fig. 1h show that the ectopic 
proteins are expressed at similar levels as the endogenous protein. 
e, Replacement of Cys37 or Cys192 by Ala in GSDMD-NT disrupts 
oligomerization. Mean ± s.d. of three technical replicates and data shown 
are representative of three independent experiments (b, c). Statistical 
differences are calculated by two-tailed t-test (in b, compared to samples 
transfected to express wild-type GSDMD-NT); **P < 0.01 (two-tailed 
t-test). 




Extended Data Figure 3 | Treatment with GSDMD-NT reduces bacterial 
viability, but does not affect the viability of mammalian cells. 
a, Antibiotic-free culture supernatants (concentrated fivefold) from 
transfected HEK293T cells, collected 30h after transfection, were added 
to iBMDMs, which were cultured at 37°C in 200 µl final volume for 6h 
before measuring viability by CellTiter-Glo. b, HEK293T cells, transfected 
with Flag-GSDMD-NT 6h earlier, were mixed with an equal number of 
CFSE-labelled untransfected HEK293T cells and incubated for 18h before 
assessing cell death by propidium iodide staining and flow cytometry. 
c, E. coli and S. aureus were untreated or treated with recombinant 
GSDMD, wild-type or 4A-mutant GSDMD-NT, or GSDMD-CT (200 nM 
or indicated concentrations) for 20 min before samples were collected and 
bacterial growth was assessed by monitoring turbidity by optical density 
(representative experiments, left). The time to reach OD₆₀₀ of 0.05 
above background, which is a quantitative measure of the lag in 
detectable growth because of fewer viable bacteria, was defined as 
T_threshold (right). The right graph shows the mean ± s.d. of three technical 
replicates. d, Bacterial viability after 20 min incubation with indicated 
proteins (200 nM) or isopropanol. Syto-9 enters live and dead bacteria, 
PI only enters dead bacteria (representative images, left; percent live 
cells, right). e, Fluorescence microscopy of mCherry-expressing 
L. monocytogenes incubated with AlexaFluor488-GSDMD (activated or 
not with caspase-11) or AlexaFluor488-GSDMD-CT for 30 min at 37°C. 
Data shown are representative of results of three independent experiments. 
Statistical differences are relative to untreated samples; **P < 0.01 
(two-tailed t-test). Scale bars, 5 µm. 




LETTER 


doi:10.1038/nature18631 


Genetic dissection of Flaviviridae host factors 
through genome-scale CRISPR screens 


Caleb D. Marceau!*, Andreas S. Puschnik!, Karim Majzoub!, Yaw Shin Ooi!, Susan M. Brewer!, Gabriele Fuchs}, 
Kavya Swaminathan’, Miguel A. Mata!, Joshua E. Elias*, Peter Sarnow! & Jan E. Carette! 


The Flaviviridae are a family of viruses that cause severe human 
diseases. For example, dengue virus (DENV) is a rapidly emerging 
pathogen causing an estimated 100 million symptomatic infections 
annually worldwide¹. No approved antivirals are available to 
date and clinical trials with a tetravalent dengue vaccine showed 
disappointingly low protection rates². Hepatitis C virus (HCV) also 
remains a major medical problem, with 160 million chronically 
infected patients worldwide and only expensive treatments 
available³. Despite distinct differences in their pathogenesis and 
modes of transmission, the two viruses share common replication 
strategies⁴. A detailed understanding of the host functions that 
determine viral infection is lacking. Here we use a pooled CRISPR 
genetic screening strategy” to comprehensively dissect host factors 
required for these two highly important Flaviviridae members. 
For DENV, we identified endoplasmic-reticulum (ER)-associated 
multi-protein complexes involved in signal sequence recognition, 
N-linked glycosylation and ER-associated degradation. DENV 
replication was nearly completely abrogated in cells deficient in 
the oligosaccharyltransferase (OST) complex. Mechanistic studies 
pinpointed viral RNA replication and not entry or translation as 
the crucial step requiring the OST complex. Moreover, we show 
that viral non-structural proteins bind to the OST complex. The 
identified ER-associated protein complexes were also important for 
infection by other mosquito-borne flaviviruses including Zika virus, 
an emerging pathogen causing severe birth defects’. By contrast, the 
most significant genes identified in the HCV screen were distinct 
and included viral receptors, RNA-binding proteins and enzymes 
involved in metabolism. We found an unexpected link between 
intracellular flavin adenine dinucleotide (FAD) levels and HCV 
replication. This study shows notable divergence in host-dependency 
factors between DENV and HCV, and illuminates new host targets 
for antiviral therapy. 

CRISPR is revolutionizing the use of genetic screens because the ability 
to completely knock out genes substantially increases the robustness of 
the phenotypes®®. We compared the CRISPR approach in the hepato- 
cyte cell line Huh7.5.1 with an alternative method to generate knock- 
out alleles on a genome-wide scale: insertional mutagenesis in human 
haploid.cells (HAP 1)*” (Fig. 1a). Both methods generate libraries of cells 
with knockout mutations in all non-essential genes. To comprehensively 
identify cellular genes with crucial roles in the Flaviviridae life cycles, 
we first infected pools of mutagenized cells with DENV serotype 2 
(DENV-2). The two types of genetic screening methods showed a strong 
concordance in the genes enriched in the DENV-2-resistant population. 
Many could be functionally classified into three distinct categories, 
each important for proper expression of ER-targeted glycoproteins 
(Fig. 1b, c, Supplementary Tables 1, 2). The translocon-associated 
protein (TRAP) complex (containing subunits SSR1, SSR2 and SSR3) 
has an elusive role in stimulating co-translational translocation 
mediated by several, but not all, signal sequences10 (Fig. 1b, c, blue). 
Genes involved in protein quality control and the ER-associated protein 
degradation (ERAD) pathway also scored highly (Fig. 1b, c, green). 
Notably, in both the haploid and CRISPR genetic screens, the most sig- 
nificantly enriched genes were subunits of the OST complex, an enzyme 
essential for N-linked glycosylation (Fig. 1b, c, red). This dependence 
on ER cellular genes is probably related to the expression of the DENV 
genome, which encodes an ER-targeted viral polyprotein containing 
signal sequences and viral glycoproteins. Given the similarities in 
DENV and HCV polyprotein expression, we expected these genes to 
also be represented in the HCV CRISPR screen. Surprisingly, there 
was no overlap between the DENV and HCV core sets of enriched 
genes, suggesting that these members of the Flaviviridae evolved diver- 
gent host factor dependencies (Fig. 1c-e, Extended Data Fig. 1a, b, 
Supplementary Tables 3, 4). Indeed, cross-comparison of the most 
significant hits with both viruses suggested specific dependencies, 
although minor quantitative effects cannot be excluded (Extended 
Data Fig. 1c). The robustness of the CRISPR approach was further 
underscored by the consistent identification of the core dependency 
factors in three independent replicate screens performed for each virus 
(Extended Data Fig. 2). We validated the novel DENV host factors in 
isogenic knockout cells using a plaque-forming assay and observed a 
marked reduction in particle formation (Extended Data Figs 3, 4a). 
Importantly, complementation of knockout cells restored DENV infec- 
tion (Extended Data Fig. 4b, c). The relevance of the identified host 
factors was further confirmed in Raji DC-SIGN, a B-cell line commonly 
used to study DENV (Extended Data Fig. 4d). 
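
The comparison of core hit sets described above (no overlap between the top-ranked DENV and HCV genes) can be illustrated with a small sketch. The gene names are taken from Fig. 1, but the scores are hypothetical placeholders, and `top_n` is an illustrative helper rather than part of the authors' analysis.

```python
def top_n(scores, n):
    """Return the set of the n highest-scoring genes."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n])


# Hypothetical per-gene scores (illustrative numbers only, not data from the paper).
denv_scores = {"MAGT1": 4.2, "STT3A": 3.9, "STT3B": 3.1, "SSR2": 3.0,
               "CLDN1": 0.2, "ELAVL1": 0.1}
hcv_scores = {"CLDN1": 4.6, "ELAVL1": 4.1, "OCLN": 3.8, "CD81": 3.7,
              "MAGT1": 0.3, "STT3A": 0.2}

denv_top = top_n(denv_scores, 4)
hcv_top = top_n(hcv_scores, 4)
shared = denv_top & hcv_top  # genes present in both core sets

print("Shared core hits:", sorted(shared))
```

With the real screens, the same set intersection would be applied to the 30 most enriched genes of each virus.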

Struck by the distinct host factor requirements of DENV-2 and HCV, 
we sought to evaluate selected DENV-2-dependency factors against 
other mosquito-borne flaviviruses that are closely related to DENV 
(Fig. 2a). Using quantitative PCR (qPCR) in isogenic knockout cells, 
we found that West Nile virus (WNV), but not yellow fever virus (YFV) 
or Zika virus (ZIKV), was as sensitive as DENV-2 to the disruption of 
the tested ERAD genes, which is in line with previous reports 
implicating ERAD in WNV infection11,12. A functional TRAP complex is 
important for DENV-2, YFV and ZIKV RNA replication, whereas 
WNV RNA abundance is only slightly reduced. Individual subunits of 
the OST complex displayed notably different phenotypes for the four 
related flaviviruses. Whereas knockout of STT3A and STT3B both com- 
pletely abolished DENV-2 replication, only STT3A knockout affected 
YFV, WNV and ZIKV replication. When probing HCV replication in 
STT3A- and STT3B-knockout Huh7 cells using luciferase virus, we did 
not observe a substantial decrease (Extended Data Fig. 4e). 
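
The qPCR readouts above express viral RNA in knockout cells relative to wild-type cells, and the Methods state that samples were normalized to 18S expression. A minimal sketch of one common way to do this is the 2^-ΔΔCt calculation below. The text does not say which quantification scheme was used, so the formula, the assumption of perfect amplification efficiency, and the Ct values are all illustrative assumptions.

```python
def relative_rna(ct_virus_sample, ct_ref_sample, ct_virus_control, ct_ref_control):
    """Relative viral RNA by the 2^-ddCt method (assumes 100% PCR efficiency).

    ct_ref_* are Ct values of the reference transcript (18S in the Methods).
    """
    delta_sample = ct_virus_sample - ct_ref_sample
    delta_control = ct_virus_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the knockout crosses threshold 5 cycles later than WT.
ko_vs_wt = relative_rna(ct_virus_sample=25.0, ct_ref_sample=10.0,
                        ct_virus_control=20.0, ct_ref_control=10.0)
print(f"Viral RNA in KO relative to WT: {ko_vs_wt * 100:.3f}%")
```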

Intrigued by the differential sensitivity to the catalytic OST subunits, 
we focused our mechanistic studies on the OST complex, which has not 
been linked to viral replication before. The highly conserved catalytic 
subunit of the OST complex, STT3, is duplicated into two paralogues 
STT3A and STT3B in mammalian cells, and each isoform is present 


1Stanford University, Department of Microbiology and Immunology, Stanford, California 94305, USA. 2Stanford University, Department of Chemical and Systems Biology, Stanford, California 


94305, USA. 
*These authors contributed equally to this work. 


00 MONTH 2016 | VOL 000 | NATURE | 1 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 1 graphic. a, Screening schematic for the CRISPR (Huh7.5.1-Cas9, lentivirus pool with guide RNAs; DENV or HCV) and haploid (HAP1, gene-trap; DENV) approaches: 1, genome-wide KO mutagenesis; 2, phenotypic selection by virus infection; 3, NGS of selected and unselected cells. b, DENV host factors (haploid screen); legend: ER translocation, ERAD pathway, oligosaccharyltransferase, heparan sulfate biosynthesis, other host factors. c, DENV host factors (CRISPR screen); legend: ER translocation, ERAD pathway, oligosaccharyltransferase, other host factors. d, HCV host factors (CRISPR screen). e, DENV CRISPR screen and HCV CRISPR screen plotted against gene rank, with DENV and HCV host factors marked.]

Figure 1 | Haploid and CRISPR genetic screens identify essential 

host factors of DENV and HCV infections. a, Schematic for genome- 
wide screening approach. NGS, next-generation sequencing. b, Haploid 
genetic screen for DENV host factors. The y axis represents significance 
of enrichment of gene-trap insertions in genes in DENV-resistant 
population compared to unselected HAP1 cells. Each circle represents a 
specific gene and size corresponds to the number of independent gene- 
trap insertions. All genes with P< 0.05 (Fisher’s exact test) were coloured 


in distinct protein complexes13 (Extended Data Fig. 4f). The STT3A 
complex is important for the co-translational N-linked glycosylation 
of most of the glycoproteins, whereas the STT3B complex is important 
for the co-translational or post-translational glycosylation of acceptor 
sites that have been skipped by the STT3A complex14. Despite their partially 
redundant functions in N-linked glycosylation, we found that all 
DENV serotypes required the presence of both catalytic subunits as well 
as MAGT1, the highest scoring gene in the CRISPR screen (Fig. 2b). 
To pinpoint which step in the viral life cycle requires the OST 
complex, we first focused on viral entry. We did not observe major 
differences in viral particle entry in OST-deficient cells (Fig. 2c). Next, 
we used a replicon assay that bypasses viral entry by electroporation of 
DENV RNA. Translation of the viral genome, apparent at time points 
up to 10 h, was as efficient in OST-knockout cells as in wild-type 
cells (Fig. 2d). In stark contrast, viral RNA replication (apparent at 
time points after 10 h) was completely abolished. This mirrored the 




[Figure 1d legend categories: HCV entry, microRNA pathway, enzymes, transcriptional regulation, other host factors.]


and grouped by function. The screen was performed once. c, d, CRISPR 
genetic screen for DENV (c) and HCV (d) host factors in Huh7.5.1 cells. 
Significance of enrichment was calculated by RIGER analysis. The screens 
were performed in three replicates and the mean of the RIGER score is 
represented on the y axis. The 30 most enriched genes were coloured and 
grouped by function. e, Comparison of the 30 most enriched genes from 
the DENV and HCV CRISPR screens and their position based on the 
mean RIGER score. 


expression pattern observed with a replication-deficient dengue mutant 
in the viral polymerase (NS5 GDD). Thus, we show that the OST complex 
has a crucial involvement in viral RNA replication, after entry and 
translation of the viral genome. 

Most glycoproteins can be efficiently modified by both OST isoforms 
and only a few are preferentially modified by either STT3A or STT3B (ref. 14). 
Concordant with this, knockout of either STT3A or STT3B did not 
lead to loss of cellular viability, whereas a double knockout was lethal 
(Fig. 3a). We demonstrated that OST catalytic activity is required for 
cellular function using STT3A and STT3B mutants containing mutations 
in the residues that coordinate the binding of the divalent cation 
required for catalysis15 (Fig. 3a, Extended Data Fig. 5). The functional 
redundancy between isoforms in global N-linked glycosylation is in 
contrast with the extreme dependence on each individual isoform of 
the OST complex for DENV replication. To investigate whether the 
effect of the OST complex is mediated by the necessity to glycosylate 




[Figure 2 graphic. a, Viral RNA for DENV2 (16681), YFV (17D), WNV (Kunjin) and ZIKV (Uganda) in STT3A, STT3B, RPN1, SSR2, SSR3, AUP1, SEL1L, UBE2J1, ASCC2 and RPS25 knockouts, grouped as oligosaccharyltransferase, TRAP complex, ERAD pathway and other hits. b, DENV-1, DENV-2, DENV-3 and DENV-4 in WT, STT3A-KO, STT3B-KO and MAGT1-KO cells. c, DENV-E and merge images at 0 min and 30 min. d, DENV-Luc-replicon signal against time post electroporation (h) for WT, STT3A-KO, STT3B-KO and WT (DENV GDD).]


Figure 2 | ER protein complexes have a crucial role in the replication 
phase of DENV and are also important for YFV, WNV and ZIKV 
infection. a, qPCR of DENV (clone 16681), YFV (17D), WNV (Kunjin), 
and ZIKV (Uganda) RNA in knockout HAP1 cells. WT, wild type. b, qPCR 
of prototypic strains of DENV serotypes 1-4 RNA in knockout (KO) Huh7 
cells. c, Confocal microscopy of STT3B-KO Huh7 cells immunostained 


viral proteins properly, we focused on NS1, an enigmatic DENV 
glycoprotein with essential roles in RNA replication16 and 
pathogenesis17. NS1 was fully glycosylated in STT3A- and STT3B-knockout 
cells in contrast to the hypo-glycosylation of control cellular 
proteins known to be preferentially glycosylated by STT3A (pSAP) or 
STT3B (SHBG) (Fig. 3b). This led us to speculate that the OST complex 
itself rather than its catalytic activity is required for DENV replication. 
To explore this hypothesis, we used the catalytic mutants of STT3A 
and STT3B (Fig. 3a, c, Extended Data Fig. 5). Surprisingly, the catalytically 
dead mutants were able to fully restore DENV replication in 
STT3A- and in STT3B-knockout cells (Fig. 3d). We thus concluded 
that DENV RNA replication has hijacked a function of the human 
OST complex that is independent from its canonical role in N-linked 
glycosylation. The dispensability of the catalytic function of the OST 
complex further suggests a more structural role of OST in viral repli- 
cation. The OST complex forms a stoichiometric complex at the ER 


for DENV envelope (DENV-E) protein immediately or 30 min after DENV 
infection. Original magnification, x 630. d, Luminescence of DENV 
replicon RNA expressing luciferase in knockout Huh7 cells. The DENV 
NS5 GDD mutant served as replication-deficient control. RLU, relative 
light units. Data are mean and s.e.m. (qPCR) or s.d. (RLU) for triplicate 
infections. 


membrane where DENV establishes a functional replication complex to 
initiate RNA replication. Our electron microscopy studies using APEX2 
confirmed the localization of STT3B in the ER membrane, in close 
proximity to the membranous replication vesicles in DENV-2-infected 
cells (Extended Data Fig. 6a-c). 

We next interrogated the physical interaction of the viral proteins 
with the OST complex by immunoprecipitation of STT3A-Flag and 
STT3B-Flag in the context of viral infection. Western blot analysis after 
immunoprecipitation showed that the non-structural proteins NS2B 
and NS3, components of the DENV replication complex, associate 
specifically with STT3A and STT3B (Fig. 3e, Extended Data Fig. 6d). 
Next, eluates of the immunoprecipitations (Extended Data Fig. 6e, f) 
were subjected to tryptic proteolysis and the resulting peptides were 
analysed by mass spectrometry. Identification of tryptic peptides from 
NS2A, NS3 and NS4A in the eluates pointed to their association with the 
OST complexes formed by STT3A and STT3B (Extended Data Fig. 6g, 


[Figure 3 graphic. a, STT3A/STT3B double-KO (Dox-On STT3B) cells transduced with pLenti-CMV-STT3A or -STT3B (WT or cat), with viability assessed with and without Dox. b, c, Blots of pSAP and SHBG in Huh7, Huh7-STT3A-KO and Huh7-STT3B-KO cells. d, DENV-Luc signal in Huh7, Huh7 STT3A-KO and Huh7 STT3B-KO cells with no addback, WT addback or cat addback. e, Input and Flag-IP (LE) blots for NS2B and NS3 in +DENV2 and uninfected samples.]


Figure 3 | DENV RNA replication requires a non-canonical function of 
OST, and DENV non-structural proteins interact with OST. a, Viability 
of STT3A and STT3B double-knockout cells complemented with wild-type 
(WT) or catalytic (cat) mutant cDNA. Dox, doxycycline. b, Glycosylation 
of DENV protein NS1, SHBG and pSAP in STT3A- and STT3B-knockout 
Huh7 cells. Different glycoforms are indicated by arrowheads. Tun., 


tunicamycin. c, Glycosylation state of pSAP and SHBG in STT3A- and 
STT3B-knockout cells complemented with catalytic mutants. d, DENV 
infection of knockout Huh7 cells complemented with wild-type or 
catalytic mutants of STT3A and STT3B. Data are mean and s.d. of 
triplicate infections. e, Co-immunoprecipitations of STT3A-Flag and 
STT3B-Flag from DENV-infected cell lysates. LE, long exposure. 




[Figure 4 graphic. Panel axes: viral RNA (relative to WT) (%) across KO cell lines; viral RNA (relative to untreated) (%) against lumiflavin concentration (µM), 0 to 100; luminescence (normalized RLU) against time post electroporation (h), 0 to 72, untreated versus +lumiflavin.]

Figure 4 | FAD biosynthesis is required for HCV replication and can 
serve as antiviral target. a, qPCR of HCV RNA in Huh7.5.1 cell lines. 
b, HCV particle formation measured by focus-forming units (FFU) 
assay. ND, no foci detected (threshold of detection is 50 FFU ml⁻¹). 
c, Biosynthesis pathway of FAD. Lumiflavin (LF) competitively inhibits 
uptake of riboflavin. Vit.B2, vitamin B2. d, qPCR of HCV RNA in 
untreated, FMN- or FAD-treated RFK- and FLAD1-knockout Huh7.5.1 
cells. e, qPCR of DENV or HCV RNA in lumiflavin-treated Huh7.5.1 


Supplementary Table 5). Taken together, this suggests a structural role 
for the OST complex in DENV RNA replication through interactions 
with non-structural proteins that form the RNA replication complex. 
Our data indicate that the OST complex fulfils specialized roles in 
host-pathogen interactions and is more multifaceted than previously 
recognized. In support of this emerging notion, two recent studies uncovered 
unexpected roles of the OST complex in immunity. The OST complex 
was found to be crucial for innate immune responses triggered 
by lipopolysaccharide18, and in a separate study OST dysregulation 
was identified as a cause of autoimmune disorders triggered by certain 
TREX1 mutations19. 

The HCV-resistant cell population was highly enriched in guide 
RNAs targeting the known HCV receptors20 CD81, OCLN and CLDN1, 
confirming their non-redundant role in entry for HCV (Fig. 1d, 
blue) and highlighting the validity of the screen results. The completeness 
of the screen was further underscored by the identification of 
microRNA-122 (ref. 21) and DGCR8 (Fig. 1d, green), which is part of 
the microRNA processing machinery, as key factors for HCV replica- 
tion. Several dependency factors of HCV were validated in Huh7.5.1 
cells, where knockout significantly reduced viral RNA and particle for- 
mation (Fig. 4a, b, Extended Data Figs 7, 8a). After CLDN1, the second 
most significantly enriched gene was ELAVL1 (also known as HUR), 
an RNA-binding protein involved in mRNA stabilization22. In isogenic 
ELAVL1-knockout cells, HCV RNA replication was nearly abolished, 
while we did not observe a decrease with other RNA viruses (Extended 
Data Fig. 8b, c) including the alphavirus Sindbis, which contains strong 
ELAVL1-binding sites23. We used an HCV replicon assay to show that 
ELAVL1 has a critical role in HCV RNA replication, which is in line 
with a recent report24 (Extended Data Fig. 8d, e). Enzymatically active 
HCV-dependency factors (Fig. 1d, red) are by far the most important 
category of host factors for identifying antiviral drug targets. A case 
in point is cyclophilin A (PPIA) that has been actively pursued until 




[Figure 4c, d, g graphic. c, Extracellular riboflavin (vit. B2) enters the cell (uptake blocked by LF) and is converted by RFK to FMN and by FLAD1 to FAD. d, Treatments applied to WT, RFK-KO and FLAD1-KO cells. g, Model schematic; labels include heparan sulfate and nucleus.]


cells. For each concentration, the significance of the effect on HCV 

versus DENV was determined. f, HCV replicon assay in untreated and 
lumiflavin-treated Huh7.5.1 cells using wild-type sgJFH1 replicon. 

g, Model of identified DENV and HCV host factors. Data are mean and 
s.e.m. (qPCR) or s.d. (FFU, RLU) for triplicate infections. *P < 0.05, 
**P< 0.01, ***P < 0.001 (unpaired, parametric two-sided Student's t-test, 
with Welch post-correction). NS, non-significant. 
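
The test named in this caption is an unpaired t-test with Welch's correction. The sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom with the standard library only. The triplicate values are hypothetical, and converting t and the degrees of freedom into a two-sided P value would need a t-distribution routine from a statistics package, which is left out here.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, degrees of freedom) for Welch's unequal-variance t-test."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    n_a, n_b = len(a), len(b)
    se2_a, se2_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical triplicate readouts (e.g. % viral RNA), not data from the paper.
untreated = [100.0, 110.0, 90.0]
treated = [10.0, 12.0, 8.0]

t_stat, dof = welch_t(untreated, treated)
print(f"t = {t_stat:.3f}, df = {dof:.3f}")
```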


phase III clinical trials25. We discovered enzymes with novel putative 
roles in HCV replication and explored these potentially ‘druggable’ 
factors further. We focused on RFK and FLAD1, enzymes involved in 
the two-step conversion of riboflavin (vitamin B2) to flavin adenine 
dinucleotide (FAD) (Fig. 4c). RFK- and FLAD1-knockout cells were 
resistant to HCV replication but not DENV (Extended Data Fig. 9a). 
As predicted from their sequential role in FAD biogenesis, exogenous 
flavin mononucleotide (FMN) or FAD rescued HCV replication in 
RFK-knockout cells, whereas FAD but not FMN rescued viral replica- 
tion in FLAD1-knockout cells (Fig. 4d, Extended Data Fig. 9b). This 
demonstrates that HCV replication is dependent solely on sufficient 
FAD levels. Modulation of intracellular FAD levels can be achieved 
by treatment of the cells with lumiflavin, a cellular uptake inhibitor of 
riboflavin26. Treatment of cells with lumiflavin greatly reduced HCV 
replication, while other RNA viruses were less sensitive to lumiflavin 
treatment (Fig. 4e, Extended Data Fig. 9b-f). We further pinpointed 
RNA replication as the step of the life cycle that requires FAD using 
a replicon system (Fig. 4f). This highlights that knockout screens 
can identify specific host targets for antiviral drug discovery. Taken 
together, we used comparative genome-scale knockout screens to iden- 
tify human genes with crucial roles in the replication of Flaviviridae. 
Despite previous extensive interrogation of human host factors for 
these viruses through genomic and proteomic approaches!!””-7°, we 
discovered marked dependencies on several host processes that had 
not been linked to flaviviral replication before (Fig. 4g, Extended Data 
Fig. 10). 
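
The rescue logic for the FAD pathway described above (riboflavin to FMN by RFK, FMN to FAD by FLAD1) can be written as a small model that predicts which supplement restores FAD in which knockout. This is only a sketch of the reasoning in the text: the function name and the data layout are illustrative, and the model ignores uptake, dosage and kinetics.

```python
# Ordered metabolites and the enzyme that catalyses each step, as stated in the text.
PATHWAY = ["riboflavin", "FMN", "FAD"]
ENZYME_FOR_STEP = {("riboflavin", "FMN"): "RFK", ("FMN", "FAD"): "FLAD1"}


def can_make_fad(knocked_out, supplement=None):
    """Return True if FAD is reachable given knocked-out enzymes and a supplement."""
    available = {"riboflavin"}  # assumed to be present in the culture medium
    if supplement is not None:
        available.add(supplement)
    for substrate, product in zip(PATHWAY, PATHWAY[1:]):
        enzyme = ENZYME_FOR_STEP[(substrate, product)]
        if substrate in available and enzyme not in knocked_out:
            available.add(product)
    return "FAD" in available


for knockout in ("RFK", "FLAD1"):
    for supplement in (None, "FMN", "FAD"):
        rescued = can_make_fad({knockout}, supplement)
        print(f"{knockout}-KO + {supplement}: FAD available = {rescued}")
```

The predictions match the reported pattern: FMN or FAD rescues the RFK knockout, whereas only FAD rescues the FLAD1 knockout.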


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 24 January; accepted 10 June 2016. 
Published online 17 June 2016. 




1. Bhatt, S. et al. The global distribution and burden of dengue. Nature 496, 
504-507 (2013). 

2. Thomas, S. J. & Rothman, A. L. Trials and tribulations on the path to developing 
a dengue vaccine. Am. J. Prev. Med. 49 (suppl. 4), S334-S344 (2015). 

3. Lavanchy, D. Evolving epidemiology of hepatitis C virus. Clin. Microbiol. Infect. 
17, 107-115 (2011). 

4. Paul, D. & Bartenschlager, R. Flaviviridae replication organelles: oh, what a 
tangled web we weave. Annu. Rev. Virol. 2, 289-310 (2015). 

5. Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human 
cells. Science 343, 84-87 (2014). 

6. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human 
cells using the CRISPR-Cas9 system. Science 343, 80-84 (2014). 

7. Rasmussen, S. A., Jamieson, D. J., Honein, M. A. & Petersen, L. R. Zika virus and 
birth defects—reviewing the evidence for causality. N. Engl. J. Med. 374, 
1981-1987 (2016). 

8. Carette, J. E. et al. Haploid genetic screens in human cells identify host factors 
used by pathogens. Science 326, 1231-1235 (2009). 

9. Carette, J. E. et al. Ebola virus entry requires the cholesterol transporter 
Niemann-Pick C1. Nature 477, 340-343 (2011). 

10. Fons, R. D., Bogert, B. A. & Hegde, R. S. Substrate-specific function of the 
translocon-associated protein complex during translocation across the ER 
membrane. J. Cell Biol. 160, 529-539 (2003). 

11. Krishnan, M. N. et al. RNA interference screen for human genes associated 
with West Nile virus infection. Nature 455, 242-245 (2008). 

12. Ma, H. et al. A CRISPR-based screen identifies genes essential for 
West-Nile-Virus-induced cell death. Cell Rep. 12, 673-683 (2015). 

13. Shrimal, S., Cherepanova, N. A. & Gilmore, R. Cotranslational and 
posttranslocational N-glycosylation of proteins in the endoplasmic reticulum. 
Semin. Cell Dev. Biol. 41, 71-78 (2015). 

14. Ruiz-Canada, C., Kelleher, D. J. & Gilmore, R. Cotranslational and 
posttranslational N-glycosylation of polypeptides by distinct mammalian OST 
isoforms. Cell 136, 272-283 (2009). 

15. Lizak, C., Gerber, S., Numao, S., Aebi, M. & Locher, K. P. X-ray structure of a 
bacterial oligosaccharyltransferase. Nature 474, 350-355 (2011). 

16. Lindenbach, B. D. & Rice, C. M. trans-Complementation of yellow fever virus 
NS1 reveals a role in early RNA replication. J. Virol. 71, 9608-9617 (1997). 

17. Beatty, P. R. et al. Dengue virus NS1 triggers endothelial permeability and 
vascular leak that is prevented by NS1 vaccination. Sci. Transl. Med. 7, 
304ra141 (2015). 

18. Parnas, O. et al. A Genome-wide CRISPR screen in primary immune cells to 
dissect regulatory networks. Cell 162, 675-686 (2015). 

19. Hasan, M. et al. Cytosolic nuclease TREX1 regulates oligosaccharyltransferase 
activity independent of nuclease activity to suppress immune activation. 
Immunity 43, 463-474 (2015). 

20. Scheel, T. K. & Rice, C. M. Understanding the hepatitis C virus life cycle paves 
the way for highly effective therapies. Nat. Med. 19, 837-849 (2013). 

21. Jopling, C. L., Yi, M., Lancaster, A. M., Lemon, S. M. & Sarnow, P. Modulation of 
hepatitis C virus RNA abundance by a liver-specific microRNA. Science 309, 
1577-1581 (2005). 

22. Brennan, C. M. & Steitz, J. A. HuR and mRNA stability. Cell. Mol. Life Sci. 58, 
266-277 (2001). 

23. Sokoloski, K. J. et al. Sindbis virus usurps the cellular HuR protein to stabilize 
its transcripts and promote productive infections in mammalian and mosquito 
cells. Cell Host Microbe 8, 196-207 (2010). 




24. Shwetha, S. et al. HuR displaces polypyrimidine tract binding protein to 
facilitate la binding to the 3’ untranslated region and enhances hepatitis C 
virus replication. J. Virol. 89, 11356-11371 (2015). 

25. Lin, K. & Gallay, P. Curing a viral infection by targeting the host: the example of 
cyclophilin inhibitors. Antiviral Res. 99, 68-77 (2013). 

26. Fujimura, M. et al. Functional characteristics of the human ortholog of 
riboflavin transporter 2 and riboflavin-responsive expression of its rat ortholog 
in the small intestine indicate its involvement in riboflavin absorption. J. Nutr. 
140, 1722-1727 (2010). 

27. Sessions, O. M. et al. Discovery of insect and human dengue virus host factors. 
Nature 458, 1047-1050 (2009). 

28. Li, Q. et al. A genome-wide genetic screen for host factors required for 
hepatitis C virus propagation. Proc. Natl Acad. Sci. USA 106, 16410-16415 
(2009). 

29. Tai, A. W. et al. A functional genomic screen identifies cellular cofactors of 
hepatitis C virus replication. Cell Host Microbe 5, 298-307 (2009). 

30. Ramage, H. R. et al. A combined proteomics/genomics approach links hepatitis 
C virus infection with nonsense-mediated mRNA decay. Mol. Cell 57, 329-340 
(2015). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements The authors thank T. Brummelkamp, J. Idoyaga and 
S. Einav for critically reading the manuscript and discussions; X. Ji and the 
Stanford Functional Genomics Facility; The Stanford Transmission Electron 
Microscope Facility; The Stanford Shared FACS Facility, and members of 
the Carette laboratory for discussions and support. S. Einav, K. Kirkegaard, 
F. Chisari, J. F. Anderson, C. Rice, H. Ploegh, R. Kopito, E. Harris, M. Ivessa, 
S. Weaver, R. Tesh and E. Campeau are acknowledged for providing materials. 
The work was funded in part by National Institutes of Health (NIH) DP2 
AI104557 (J.E.C.), NIH AI109662 (J.E.C.), David and Lucile Packard Foundation 
(J.E.C.), Stanford Graduate Fellowship (A.S.P.), Boehringer Ingelheim Fonds 
(A.S.P.) and NSF-GFRP (C.D.M.). 


Author Contributions C.D.M., A.S.P. and J.E.C. were responsible for overall 
design of the study. Most of the experiments related to the DENV and HCV 
genetic screens were performed by C.D.M. and A.S.P., respectively. K.M. 
designed and performed several validation experiments and J.E.C. designed 
dengue constructs. Y.S.O. performed one of the DENV screens. P.S. provided 
expertise in design of HCV experiments and M.A.M. and S.M.B. performed 
several HCV experiments. Mass spectrometry experiments were performed and 
analysed by C.D.M., G.F. and K.S. under the technical expertise of J.E.E. C.D.M., 
A.S.P., K.M. and J.E.C. wrote the manuscript with input from all authors. 


Author Information The CRISPR and haploid genetic screens have been 
deposited in the NCBI BioProject database under accession numbers 
PRJNA322191 and PRJNA284536, respectively. Reprints and permissions 
information is available at www.nature.com/reprints. The authors declare no 
competing financial interests. Readers are welcome to comment on the online 
version of the paper. Correspondence and requests for materials should be 
addressed to J.E.C. (carette@stanford.edu). 


Reviewer Information Nature thanks W. Wei and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 




METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and investigators were not blinded to allocation during 
experiments and outcome assessment. 

Haploid genetic screen. The haploid genetic screen was performed as previously 
described’. In short, 100 million gene-trap mutagenized HAP1 cells were seeded 
and infected with DENV-2 16681 (multiplicity of infection (MOI) = 5). Eight hours 
after infection, media was aspirated and replaced with IMDM containing 2% FBS. 
Clear cytopathic effects were observed after 3 days of infection leading to the death 
of most cells. Clusters of cells resistant to DENV-2 infection became apparent dur- 
ing further culture, and 9 days after infection cells were collected as a pool (yield 
~30 million cells) and genomic DNA was isolated using a QIAamp DNA column. 
Gene-trap insertion sites were determined by linear amplification of the genomic 
DNA (gDNA) flanking regions of the gene-trap DNA insertion sites and sequenced 
on a Genome Analyzer II. Reads were aligned to the human genome using Bowtie 
and enrichment of independent insertions was calculated as previously described’. 
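
The Figure 1 caption states that enrichment of gene-trap insertions was assessed with Fisher's exact test. A minimal sketch of such a test for a single gene is given below, using only the standard library. The one-sided formulation, the layout of the 2x2 table and all counts are assumptions made for illustration, and the authors' exact procedure is the one given in their cited reference.

```python
from math import comb


def fisher_enrichment_p(gene_sel, total_sel, gene_unsel, total_unsel):
    """One-sided Fisher's exact P value for enrichment of insertions in a gene.

    gene_sel / gene_unsel: insertions in the gene (selected / unselected pool).
    total_sel / total_unsel: all mapped insertions in each pool (gene included).
    """
    gene_total = gene_sel + gene_unsel
    grand_total = total_sel + total_unsel
    upper = min(gene_total, total_sel)
    # Hypergeometric tail: probability of seeing gene_sel or more insertions by chance.
    tail = sum(
        comb(gene_total, k) * comb(grand_total - gene_total, total_sel - k)
        for k in range(gene_sel, upper + 1)
    )
    return tail / comb(grand_total, total_sel)


# Hypothetical counts: 5 of 10 selected insertions hit the gene, 1 of 10 unselected.
p_value = fisher_enrichment_p(gene_sel=5, total_sel=10, gene_unsel=1, total_unsel=10)
print(f"P = {p_value:.4f}")
```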
CRISPR genetic screens. Huh7.5.1 cells were stably transduced with lenti- 
Cas9-Blast and subsequently selected using blasticidin. Next, a total of 300 million 
Huh7.5.1 cells that constitutively express Cas9 were transduced with the lentiGuide-Puro 
from the GeCKO v2 library31 at a MOI of 0.3. Cells were selected 
with puromycin and pooled together. The CRISPR genetic screens were started 
10 days after transduction. Approximately 60 million mutagenized cells for each 
library (A and B) were infected with DENV-2 16681 (replicate 1 and 2) or DENV-2 
strain 429557 (replicate 3) using a MOI of 1, or with HCV JFH1 at a MOI of 0.3. 
Cytopathic effect was visible 2 and 5 days after infection for DENV and HCV, 
respectively. Huh7.5.1 cells grow slower than HAP1 cells and clusters of resistant 
cells took longer to develop. The selected cells were collected 16 days after infection. 
As an uninfected reference we chose the unselected starting population because in 
these strong positive selection screens the selection pressure of the viral infections 
renders potential small growth differences of the mutagenized cells inconsequential. 
For both selected and uninfected control cells, gDNA was isolated using a 
QIAamp DNA column, and the inserted guide RNA sequences were amplified 
from the gDNA by flanking primers and prepared for next-generation sequencing. 
Resulting amplicons were sequenced on a MiSeq or NextSeq platform (Illumina) 
and the enrichment of each guide RNA was calculated by comparing the rela- 
tive abundance in the selected and unselected population. RIGER analysis was 
performed on guide RNAs (with at least 10 reads) ranked by enrichment using the 
weighted sum statistical method32. Each CRISPR screen was performed in three 
replicates and the mean of the three RIGER scores was calculated. 
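
The guide-level steps described above (compare relative abundance in selected and unselected cells, keep guides with at least 10 reads, average scores across three replicates) can be sketched as follows. This is not RIGER: the log2 ratio is a simple stand-in for the enrichment measure, applying the read threshold to the selected population is an assumption, and all read counts and names are hypothetical.

```python
import math
from statistics import mean

MIN_READS = 10  # read threshold stated in the Methods


def guide_enrichment(selected, unselected, min_reads=MIN_READS):
    """log2 ratio of relative abundance (selected / unselected) for each guide."""
    sel_total = sum(selected.values())
    unsel_total = sum(unselected.values())
    scores = {}
    for guide, sel_reads in selected.items():
        unsel_reads = unselected.get(guide, 0)
        if sel_reads < min_reads or unsel_reads == 0:
            continue  # too few reads to score reliably
        ratio = (sel_reads / sel_total) / (unsel_reads / unsel_total)
        scores[guide] = math.log2(ratio)
    return scores


def mean_across_replicates(replicate_scores):
    """Average a per-gene score over replicate screens (genes present in all)."""
    genes = set.intersection(*(set(r) for r in replicate_scores))
    return {g: mean(r[g] for r in replicate_scores) for g in genes}


# Hypothetical read counts for four guides.
selected = {"g1": 900, "g2": 90, "g3": 10, "g4": 0}
unselected = {"g1": 250, "g2": 250, "g3": 250, "g4": 250}

scores = guide_enrichment(selected, unselected)
ranked = sorted(scores, key=scores.get, reverse=True)
print("Guides ranked by enrichment:", ranked)
```

The ranked guide list is the kind of input that a gene-level method such as RIGER would then aggregate; that step is not reproduced here.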

Cell culture. HAP1 cells were derived from the near-haploid chronic myeloid leukaemia 
cell line KBM7 as described earlier’. HAP1 cells and knockout derivatives 
were cultured in IMDM supplemented with 10% FBS, penicillin-streptomycin and 
L-glutamine. STT3A/STT3B double-knockout HAP1 cells were cultured in IMDM 
supplemented with 10% FBS, penicillin-streptomycin, L-glutamine and 25 ng ml⁻¹ 
doxycycline. Huh7, Huh7.5.1 (both gifts from F. Chisari) and HEK293FT (Thermo 
Scientific) cells and knockout derivatives were grown in DMEM supplemented 
with 10% FBS, penicillin-streptomycin, non-essential amino acids and 
L-glutamine. HEK293FT cells were used to generate lentivirus vectors for cel- 
lular transductions. Raji DC-SIGN cells (a gift from E. Harris) were cultured in 
RPMI-1640 supplemented with 10% FBS, penicillin-streptomycin and 
L-glutamine. The cell lines have not been authenticated. Parental cell lines have 
been tested negative for mycoplasma. 

Viral serotypes and strains. DENV-2 infectious clone 16681 was a gift from K. 
Kirkegaard. DENV-2 from infectious clone 16681 was adapted to HAP1 cells 
through serial passaging. Viral whole-genome sequence analysis revealed three 
coding mutations compared to the original clone 16681: Q399H in envelope,
L180F in NS2A and S238F in NS4B. DENV-1 Hawaii 1944 (#NR-82), DENV-2
strain 429557 (#NR-12216), DENV-2 New Guinea C 1944 (#NR-84), DENV-3
Philippines/H871856 (#NR-80) and DENV-4 H241 Philippines 1956 (#NR-86) 
were ordered from BEI resources (NIH, NIAID). Yellow fever virus was generated 
by culturing the yellow fever YF-VAX 17D-204 vaccine. West Nile virus (Kunjin 
strain CH 16532) was a gift from J. F. Anderson. Zika virus (strain MR766) was
provided by S. Weaver and R. Tesh. Hepatitis C virus JFH1 and HCV-Luc pFL- 
J6/JFH-5'C19Rluc2AUbi vector were gifts from C. Rice. Sindbis virus (SINV) 
strain Ar-339 (TC adapted) Egypt 1952 (ATCC VR-1585) and human rhinovirus 
14 (ATCC VR-284) were ordered from the American Type Culture Collection. 
Poliovirus type 1 strain Mahoney was a gift from H. Ploegh. Venezuelan equine 
encephalitis virus TC-83 (pVEEV/GFP) was a gift from I. Frolov.

qPCR infectivity assays. Cells were plated in 96-well plates and infected with an 
MOI of 0.1 of virus, unless otherwise stated. Cells were collected as outlined in 
Ambion Power SYBR Green Cells-to-Ct kit (Ambion 4402954). Cells were collected
8 h after infection with poliovirus, 24 h after infection with Sindbis virus,
Venezuelan equine encephalitis virus and human rhinovirus 14, 2 days after infection
with DENV-2 16681, Zika virus, yellow fever virus and West Nile virus
Kunjin strain, 3 days after infection with HCV JFH1 and 5 days after infection with 
DENV-2 New Guinea. For comparison of DENV serotypes, cells were infected at 
an MOI of 0.01 with DENV-1 Hawaii 1944, DENV-2 New Guinea C 1944, DENV-3 
Philippines/H871856 and DENV-4 H241 Philippines 1956 for 2 days. All samples 
were normalized to 18S expression. Two independent experiments were performed 
with triplicate infections and one representative is shown. 
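
Normalization to 18S can be illustrated with the common comparative Ct calculation. The text only states that samples were normalized to 18S expression, so the ΔΔCt formula, the assumption of roughly 100% amplification efficiency, and the function names below are assumptions made for this sketch.

```python
def delta_ct(ct_virus, ct_18s):
    """Ct of the viral amplicon relative to the 18S reference amplicon."""
    return ct_virus - ct_18s


def relative_viral_rna(ct_virus, ct_18s, ref_ct_virus, ref_ct_18s):
    """Viral RNA level relative to a reference sample (e.g. wild-type cells).

    Assumes ~100% PCR efficiency, so one cycle corresponds to a twofold change.
    """
    ddct = delta_ct(ct_virus, ct_18s) - delta_ct(ref_ct_virus, ref_ct_18s)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the knockout sample crosses threshold 2 cycles later
# than wild type at equal 18S signal, i.e. a fourfold lower viral RNA level.
knockout_vs_wild_type = relative_viral_rna(27.0, 10.0, 25.0, 10.0)
```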

The following qPCR primers were used: DENV-2-forward:
5'-GCCCTTCTGTTCACACCAT-3', reverse: 5'-GGCTCTGCCAATCAGTTCAT-3';
universal-DENV-forward: 5'-GGTTAGAGGAGACCCCTCCC-3', reverse:
5'-GGTCTCCTCTAACCTCTAGTCC-3'; yellow fever forward:
5'-GAAATGGETGCCCTTTATGA-3', reverse: 5'-GCACATGGCAACAGAAGCTA-3';
Kunjin-forward: 5'-GCTTTGCCACCTCTCTTCAC-3', reverse:
5'-GEGGTTGATGGTTTCCACTCT-3'; ZIKV-forward: 5'-ACCATACGGCCAACAAAGAG-3',
reverse: 5'-TCCACAGCCAGGAAGAGACT-3'; HCV-forward:
5'-TCTCTCAGTCCTTCCTCGGA-3', reverse: 5'-AAGCCGGCTAGAGTCTTGTT-3';
SINV-forward: 5'-CGCGGTCACGTAAGGATAAT-3', reverse:
5'-TTTGGCATTCTTCAGCACAG-3'; polio-forward: 5'-CAACCTCCCACTGGTGACTT-3',
reverse: 5'-ATTTCCCCTGCTCAACCTTT-3'; 18S-forward:
5'-AGAAACGGCTACCACATCCA-3', reverse: 5'-CACCAGACTTGCCCTCCA-3';
VEEV-forward: 5'-CAGGACGATCTCATTCTCAC-3', reverse:
5'-TCATTCACCTTGTACCGAACG-3'; HRV-14-forward:
5'-AAGCAATTTGGTGGTCCAAG-3', reverse: 5'-ACACTGGGGTTTGAAGCACT-3'.

Crystal violet staining. For virus infections, wild-type and knockout cell lines were 
plated out in either 24- or 96-well plates. Cells were infected with DENV-2 (16681), 
HCV, SINV or polio using a MOI of 1. Huh7-STT3A-KO-STT3B-KO-pLenti-
TRE3G-CMV-STT3B cells were cultured in presence or absence of 25 ng ml⁻¹
doxycycline. Cells were incubated for 48-120h then fixed using 4% formaldehyde
in PBS. Cell viability at time of fixation was determined by crystal violet staining.

Plaque-forming units assay. Plaque assays were performed on BHK-21 cells. In
brief, BHK-21 monolayers were grown to 80% confluency in 24-well plates and 
incubated for 1h at 37 °C in 5% CO2 with serially diluted virus supernatants from
wild-type and mutant HAP1 cells infected with DENV, at a MOI of 0.1 for 48h.
The wells were then overlaid with DMEM, 0.8% Aquacide II (EMD Millipore), 
and 10% FBS, incubated for 7 days, and fixed with 10% formaldehyde. The cells 
were then stained overnight with crystal violet. The next day the wells were exten- 
sively washed with water then dried, and the resulting plaques were counted and 
plaque-forming units per ml were calculated. Two independent experiments were 
performed with triplicate infections and one representative is shown. 
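
The conversion from counted plaques to plaque-forming units per ml follows the standard titration arithmetic. The sketch below is an illustration only: the inoculum volume and dilution are not stated in the text, so the example values and the function name are hypothetical.

```python
def pfu_per_ml(plaque_count, dilution, inoculum_ml):
    """Titer = plaques / (dilution of the supernatant x inoculum volume in ml).

    `dilution` is the fraction of undiluted supernatant in the well's inoculum,
    for example 1e-4 for the fourth step of a tenfold series.
    """
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaque_count / (dilution * inoculum_ml)


# Hypothetical well: 30 plaques from 0.1 ml of a 1e-4 dilution.
example_titer = pfu_per_ml(30, 1e-4, 0.1)
```

With these example numbers the titer is about 3 × 10^6 PFU per ml; the same arithmetic applies to focus-forming units when fluorescent foci are counted instead of plaques.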
Focus-forming units assay. Wild-type and knockout Huh7.5.1 cells were plated 
in 24-well plates and infected with HCV at a MOI of 0.1. Three days after infec- 
tion, supernatant was collected and added to wild-type Huh7.5.1 cells in a tenfold 
dilution series. After 3 days, cells were fixed, stained with mouse-anti-HCV-core
(Abcam ab2740) and anti-mouse-IgG-Alexa-488 (Life Technologies) and fluores- 
cent colonies were counted. Two independent experiments were performed with 
triplicate infections and one representative experiment is shown. 

Luciferase reporter virus assays. Cells were plated out in 96-well plates in tripli- 
cates and infected with dengue luciferase reporter virus at an MOI of 0.01. Cells 
were incubated with dengue luciferase reporter virus at 37°C, 5% CO2, and cell
lysates were collected. Luciferase expression was measured using Renilla Luciferase 
Assay system (Promega E2820). Cells were lysed using Renilla lysis buffer and lucif- 
erase activity measured by addition of substrate and immediate luciferase readings 
were taken using Glomax 20/20 luminometer using a 10-s integration time. For 
the cross-comparison of the effects of host factors on DENV and HCV, Huh7.5.1
knockout cell lines were infected with dengue luciferase reporter virus at an MOI 
of 0.01 or with HCV luciferase virus at an MOI of 0.2. For the validation of HCV 
host factors, four different knockout cell lines (created using lentiCRISPRv2) were 
infected with HCV luciferase virus at an MOI of 0.2. Two independent experiments 
were performed with triplicate infections and one representative experiment is 
shown, with the exception of the experiment shown in Extended Data Fig. 4e, 
which was performed once with triplicate infections. 

Infection of Raji DC-SIGN cells. Raji DC-SIGN host factor knockout cell lines 
were created by transduction of lentiCRISPRv2 and subsequent puromycin selec- 
tion. Resulting cell lines were infected with dengue luciferase virus at an MOI of 
0.05 and collected 3 days after infection. Three independent experiments were 
performed with triplicate infections and one representative is shown.

Internalization assay and confocal microscopy. Approximately 10,000 Huh7 cells
were seeded on Lab-Tek II Chamber Slides (Thermo Fisher Scientific). The
next day, cells were incubated on ice for 15 min, infected with DENV (MOI=60) 
and incubated on ice for 1h. Cells were washed three times with ice-cold PBS and 
subsequently incubated at 37°C for 0 or 30 min. At each time point, 10 µg ml⁻¹


© 2016 Macmillan Publishers Limited. All rights reserved 


wheat germ agglutinin-Alexa-594 (Life Technologies, W11262) was added for 
10 min at room temperature before three washes with PBS and fixation with 4% 
paraformaldehyde. Dengue virus was stained with a rabbit-anti-dengue-envelope 
antibody (Genetex, 127277) in block/perm buffer (1% saponin, 1% Triton X-100, 
5% FBS) for 1h followed by incubation with goat anti-rabbit-IgG-Alexa-488 (Life 
Technologies, A-11008) and DAPI (Insitus, F203) for 30 min. After three washes 
with PBS, cells were visualized using confocal microscopy. 

Replicon assays. Dengue replicon plasmid was linearized using XbaI restric- 
tion enzyme. Replicon RNA was generated using the MEGAscript T7 High 
Yield Transcription Kit (Ambion, AM1334) with the reaction containing 
5 mM m7G(5')ppp(5')G RNA Cap Structure Analogue (NEB, S1405S). Resulting
RNA was purified by sodium acetate ethanol precipitation. HCV sgJFH1 replicon* 
RNA was prepared as described for DENV with the exception of adding the cap 
structure analogue. Cells were washed twice with PBS and re-suspended in electro- 
poration buffer (Teknova, E0399). Three micrograms of purified replicon RNA was 
mixed with cells, and cells were electroporated using Bio-Rad Gene Pulser Xcell 
electroporator using square wave protocol. Electroporated cells were resuspended 
in cell culture medium without antibiotics and plated into 24-well plates. Luciferase 
expression was measured using Renilla Luciferase Assay system (Promega, E2820). 
Cells were lysed using Renilla lysis buffer and luciferase activity measured by addi- 
tion of substrate and luciferase readings were taken immediately using Glomax 
20/20 luminometer using a 10-s integration time. For lumiflavin treatment, cells 
were electroporated with 2 µg of viral RNA and 1 µg of firefly mRNA (Trilink)
to normalize for effects on cell proliferation. For the lumiflavin treatment, two 
independent experiments with three electroporations each were performed. One 
representative experiment is shown. For the replicon assay in ELAVL1-knockout 
cells, three independent experiments with a single electroporation were performed. 
The average of the experiments is shown. For the DENV-replicon assay, three independent
experiments were performed. One representative experiment is shown.
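
The firefly co-electroporation described above serves as an internal control for cell number. A minimal sketch of that normalization follows; the ratio-based formula, the percent-of-control readout, and the function names are assumptions for illustration, since the text does not spell out the arithmetic.

```python
def normalized_replicon_signal(renilla, firefly):
    """Renilla (replicon) signal divided by firefly (co-electroporated mRNA)."""
    if firefly <= 0:
        raise ValueError("firefly signal must be positive")
    return renilla / firefly


def percent_of_control(treated_ratio, control_ratio):
    """Express a normalized signal as a percentage of the untreated control."""
    return 100.0 * treated_ratio / control_ratio


# Hypothetical luminometer readings (arbitrary units).
control = normalized_replicon_signal(500.0, 100.0)
treated = normalized_replicon_signal(200.0, 80.0)
treated_percent = percent_of_control(treated, control)
```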
Immunoblot analysis. Cell pellets were lysed using Laemmli SDS sample buffer
containing 5% β-mercaptoethanol and boiled for 10 min. Lysates were separated by
SDS-PAGE on pre-cast Bio-Rad 4-15% polyacrylamide gels in a Bio-Rad Mini-PROTEAN
gel system. Proteins were transferred onto PVDF membranes using Bio-Rad
trans-blot protein transfer system. PVDF membranes were blocked with PBS buffer 
containing 0.1% Tween-20 and 5% non-fat milk. Blocked membranes were incu- 
bated with primary antibody diluted in blocking buffer and incubated overnight 
at 4°C rotating. Primary antibodies were detected using horseradish peroxidase 
(HRP)-conjugated secondary anti-mouse and anti-rabbit antibodies (Genetex) by 
incubating membranes at a 1:5,000 dilution for 1 h at room temperature. Antibody- 
bound proteins were detected by incubating with Pierce West Pico and Extended 
Duration Peroxide Solutions and visualized on film. Wild-type cells were treated 
with 10 µg ml⁻¹ tunicamycin for 3-4h at 37°C, 5% CO2. To visualize
proteins by immunoblotting, the following antibodies were used: anti-SHBG
(Genetex, GTX63795) at a dilution of 1:2,500. Anti-pSAP (Genetex, GTX101064) 
at a dilution of 1:2,500. Anti-HA C29F4 (Cell Signaling, 3724P) at a dilution of 
1:2,500. Anti-mouse M2-Flag (Sigma, F1804-200UG) at a dilution of 1:2,500. 
Anti-DYKDDDDK (Flag) (Cell Signaling, 2368) at a dilution of 1:2,500. Anti-RPN1 
(gift from M. Ivessa) at a dilution of 1:2,000. Anti-NS1 (Genetex GTX124280) 
at a dilution of 1:2,500. Anti-P84 (Genetex GTX70220) at a dilution of 1:3,500. 
Anti-DENV-ENV (Genetex GTX127277) at a dilution of 1:2,500. Anti-prM (Genetex, 
GTX128092) at a dilution of 1:2,500. Anti-NS2B (Genetex, GTX124246) at a dilution 
of 1:2,500. Anti-NS3 (Genetex, GTX124252) at a dilution of 1:2,500. Anti-RFK (Sigma, 
SAB1409492) at a 1:500 dilution. Anti-FLAD1 (Santa Cruz Bio, sc-376819) at a 1:250 
dilution. Anti-STT3B (Sigma, HPA036646) at a dilution of 1:1,000. Anti-MAGT1 
(Proteintech Group, 17430-1-AP) at a dilution of 1:1,000. Anti-RPS25 (Abcam, 
102940) at a dilution of 1:1,000. Anti-HUR (Santa Cruz, sc-5261) at a dilution of 1:200.
Anti-SRRD (Sigma, HPA002945) at a dilution of 1:500. 

Lentiviral or retroviral complementations. Lentiviral or retroviral transduc- 
tion was used to create stable cell lines expressing a selected gene of interest. 
Respective genes of interest (see ‘Construction of lenti- or retroviral constructs’ 
section) were cloned into the pLenti-CMV-Puro-DEST vector (w118-1) (a gift 
from E. Campeau), or PMX-IRES-BLAST-DEST. Lentivirus or retrovirus pro- 
duced in HEK293FT cells was used to transduce respective cell lines overnight. 
Cells stably expressing the gene of interest were selected by treatment with 
1-4 µg ml⁻¹ puromycin or 10-50 µg ml⁻¹ blasticidin over 2 days (InvivoGen) along
with untransduced cells as negative control. 

Genome engineering. CRISPR guide RNA sequences were designed using the 
Zhang laboratory CRISPR design tool (see Extended Data Fig. 3 for CRISPR target 
sites). Corresponding oligonucleotides or geneblocks containing U6 promoter 
sequence and U6 termination sequence were ordered from IDT. Oligonucleotides 
were cloned into the Zhang laboratory generated Cas9 expressing pX458 guide 
RNA plasmid (Addgene) as previously described using Gibson assembly reaction 


LETTER 


(New England Biolabs). Geneblocks were cloned into pCR-Blunt II-TOPO vector
(Life Technologies). TOPO-cloned geneblocks were co-transfected into respective 
cells with an mCherry-expressing construct and an hCas9-expressing vector (Addgene
41815 hCas9 Church pcDNA3.3-Topo), whereas guide RNAs encoded in the pX458 plasmids
were transfected alone, using Lipofectamine 2000 (Life Technologies) according to the
manufacturer's guidelines. Transfected cells were single-cell sorted based on GFP
or mCherry expression into 96-well plates using a BD Influx cell sorter. Clonal
cell lines were allowed to expand and genomic DNA was isolated for sequence-based
genotyping of the targeted allele. For this, a 500-700-base pair (bp) region that
encompassed the guide RNA-targeted site was amplified and the PCR product was 
Sanger sequenced. In haploid cells (HAP1), only one mutated allele was present 
in the sequenced PCR product and cellular subclones containing frameshift
mutations or large indels were selected. In aneuploid Huh7.5.1 cells, we regularly
observed that the PCR product contained more than one trace, suggesting
non-identical mutations in multiple alleles. In this case, the PCR product was
cloned into a plasmid vector and colonies were sequenced to separate allele-specific
mutations. Subclones were chosen where all alleles were mutated. It should
be noted that in aneuploid Huh7.5.1 cells, we sometimes observed cellular subclones
where all mutant alleles contained the same mutation (for example, CD81
and ELAVL1). It has been reported that CRISPR/Cas9 technology can generate
homozygous bi-allelic mutations more frequently than expected in diploid cells 
or cancer cells***°, perhaps because both alleles were independently repaired in 
an identical manner or because one allele served as a template for homology-
directed repair of the other allele. To create KO cell lines using lentiCRISPRv2
(Addgene), the guide RNAs listed below were cloned into the vector, and Huh7.5.1
cells were lentivirally transduced and selected with puromycin. The following
guide RNA sequences were used: ANKRD49: AGAAAGGAGTCTCCGCACTG; 
ANKRD49 guide2: ATGAACCGTTACGTCAAACC; ANKRD49 guide3: 
GCCCAAAGAAGCAATCTGCT; ANKRD49 guide4: AGAAAGGAGTCT 
CCGCACTG; CD81: GCGCCCAACACCTTCTATGT; CLDN1: CGATGGCG 
CCGATCCATCCC; ELAVL1: TTGGGCGGATCATCAACTCG; ELAVL1 
guide2: TGTGAACTACGTGACCGCGA; ELAVL1 guide3: GGGCCT 
CCGAACCGTCGCGC; ELAVL1 guide4: AGAGCGATCAACACGCTGAA; 
EMC1: AGGCCGAATCATGCGTTCCT; EMC2: GATTGCCATTCGAAAA 
GCEC; EMC3: GTGCCACCTTCTCCTATGAC; EMC4: TGCTTGTCCAA 
GTAACCGAC; FKBPL: GTCAAGAAGATCGTAATCCG; FKBPL guide2: 
GAAGAGCCCGTCCATAGCAT; FKBPL guide3: ACAGAGCTAACT 
ATGGGCGT; FKBPL guide4: GTTTCGGTAGGAGGGTCTCG; FLAD1: 
ACAGACCATTGAGACCTCCC; FLAD1 guide2: CATGCGCATCAACC 
CACTGC; FLAD1 guide3: TACAGGAGTAGGGGTCAGTC; FLAD1 guide4: 
TGTGTCCCTGGGGGTTGAAG; MAGT1: GAGCGAACATGGCAGCGCGT; 
MIR-122: GAGTTTCCTTAGCAGAGCTG; MMGT1: CAGGCACTTACGCTG 
CGCAG; non-targeting: GCCCAGACGCCCTAGAATAG; OCLN: 
ACGTAGAGTCCAGTAGCTGC; OSTC: TCAGTCATAGAACCGACACT; 
PPIA: GTACCCTTACCACTCAGTCT; RFK: TATCATGCATACCTTCAAAG; 
RFK guide2: GGTCAAGTGGTGCGGGGCTT; RFK guide3: CTATGGGG 
AAATCCTCAATG; RFK guide4: CCAACCATAGTAAATACCAG; RPN2: 
TCGCTACCACGTGCCAGTTG; SRRD: GACTGTTCTCAGTGAGAACG; SRRD 
guide2: GATAGATACCTTTGCAATGT; SRRD guide3: ATTGAAGTCC 
TTAACACCCT; SRRD guide4: AACAACTGAAGGCCCCTGTG; SSR2: 
CAATAGCAGGGGGATGCCGA; SSR3: GACCCTAGTAAGCACATATT; STT3A: 
GTACTCACGGATCAAACTCA; STT3B: TACAGCAAAAGAGTCTACAT; ZEB1:
TGAAGACAAACTGCATATTG.
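
The subclone selection rule described earlier in this section (a frameshift or large indel on every sequenced allele) can be expressed as a short check. This is a sketch under stated assumptions: the text does not define "large", so the 50-bp threshold is hypothetical, as are the function names.

```python
def classify_allele(indel_bp, large_indel_bp=50):
    """Classify one allele from its net indel size in base pairs.

    Negative values are deletions, positive values insertions. The threshold
    for a "large" indel is a hypothetical choice, not taken from the text.
    """
    size = abs(indel_bp)
    if size == 0:
        return "wild-type"
    if size >= large_indel_bp:
        return "large indel"
    if size % 3 != 0:
        return "frameshift"
    return "in-frame indel"


def is_knockout_clone(allele_indels, large_indel_bp=50):
    """True if every sequenced allele carries a frameshift or large indel."""
    disruptive = {"frameshift", "large indel"}
    return bool(allele_indels) and all(
        classify_allele(indel, large_indel_bp) in disruptive
        for indel in allele_indels
    )
```

A haploid HAP1 clone has a single entry in `allele_indels`, whereas an aneuploid Huh7.5.1 clone has one entry per allele recovered by subcloning the PCR product.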

TALENs targeting AUP1 in HAP1 cells were generated as indicated in Extended
Data Fig. 3. Cells were co-transfected with left and right TALEN-containing con- 
structs and an mCherry-expressing construct using Lipofectamine 2000 (Life 
Technologies) according to manufacturer’s guidelines. Transfected cells were 
single-cell sorted based on mCherry expression into 96-well plates using BD influx 
cell sorter. Subclones were allowed to expand and genomic DNA was isolated 
for sequence-based genotyping of the AUP1 allele. HAP1 cells containing gene-trap
insertions in STT3A, STT3B, RPN1, SSR2, SSR3, ASCC2 and RPS25 were isolated 
by picking resistant colonies from the DENV-2 haploid genetic screen. Picked 
colonies were screened for gene-trap insertions using PCR with primers directed 
to the gene-trap and the flanking region of the gene of interest.

Co-immunoprecipitation. Wild-type HAP1 cells were transduced with STT3A-
Flag, STT3B-Flag or RPS25-Flag lentiviral vectors (see ‘Construction of lenti- or
retroviral constructs’ section). HAP1 cells expressing Flag-tagged proteins were 
trypsinized and washed once with PBS. Cells were lysed with TNM buffer (25 mM 
Tris-HCl, 15 mM NaCl, 5 mM MgCl2) containing 1% digitonin, 1 mM PMSF, and
Halt Protease and Phosphatase Inhibitor Cocktail (Life Technologies, 78440) for 
1h on ice gently vortexing every 15 min. Cell lysates were clarified by centrifu- 
gation at 15,000g for 10 min. Clarified lysates were incubated at 4°C overnight
with Anti-Flag M2 Magnetic Beads (Sigma M8823-5ml). Incubated beads were
washed three times with TNM buffer containing 0.1% digitonin, 1 mM PMSF and
1× Halt Protease and Phosphatase Inhibitor. Cells were washed once with TNM
buffer and competitively eluted with TNM buffer containing 150 ng µl⁻¹ 3× Flag
Peptide (Sigma, F4799) for 30 min on ice. For immunoblotting, elutions were denatured
by boiling in 5× sample buffer and analysed by SDS-PAGE using antibodies
against DENV non-structural proteins. Elutions were also prepared for mass 
spectrometry analysis. Cells expressing RPS25-Flag, a host protein with a likely 
different mechanistic action, as well as untransduced cells, were used as a negative 
control in these experiments. 

SYPRO ruby staining. After electrophoresis, the gel was fixed for 30 min in fixative
buffer (50% methanol, 7% acetic acid) and incubated with SYPRO Ruby Stain
(Fischer, S-12000) overnight. The gel was then washed once with wash buffer
(10% methanol, 7% acetic acid) and twice with distilled water. The gel was imaged on
a Molecular Dynamics Storm scanner.

Construction of STT3A/STT3B double-knockout cell line with STT3B conditionally
expressed using a Tet-On system. HAP1 cells stably transduced with the
transactivator pLenti-CMV-rtTA3G-Blast (R980-M38-658) (Addgene, 31797) with
the endogenous STT3B gene disrupted by CRISPR/Cas9 were lentivirally trans- 
duced with pLenti-CMVTRE3G-Puro-STT3B-Flag, which drives STT3B under 
a doxycycline-inducible promoter. Transduced cells were then transfected with 
pX458 plasmid encoding a guide RNA targeted to the STT3A gene. Transfected 
cells were then subcloned based on GFP expression of PX458 plasmid into 96-well 
plates containing IMDM plus 10% FBS, penicillin-streptomycin, L-glutamine and 
25 ng ml⁻¹ doxycycline. Subclones were allowed to grow for 2 weeks, then replica
plated and in one replicate the doxycycline medium was washed away and replaced 
with regular growth medium without doxycycline and incubated for 5 days. Cells 
that were dependent on doxycycline for growth were genotyped to verify both 
endogenous STT3A and STT3B had double frame shifting CRISPR/Cas9 editing 
events. STT3A/STT3B endogenous double knock out cells were then lentivirally 
transduced with wild-type or mutant STT3A or STT3B under the CMV promoter. 
The lentivirally transduced cell lines were then plated in 96-well plates and the 
doxycycline was washed away and incubated for 5 days. Cells were then fixed and 
stained with crystal violet to assess cell viability. 

Treatment of HCV infected cells with lumiflavin, FMN and FAD. Wild-type 
Huh7.5.1 cells were treated with lumiflavin (Santa Cruz Bio, sc-224045) at concentrations
ranging from 10 to 100 µM and infected with HCV or DENV-2 at a MOI of 0.1. For WNV,
YFV, PV (polio virus), SINV, VEEV and HRV-14, cells were treated with 50 µM
lumiflavin and infected at a MOI of 0.1. For rescue of HCV replication in
lumiflavin-treated cells, 100 µM FMN and 10 mM FAD were used. RFK- and FLAD1-KO
Huh7.5.1 cells were cultured in absence or presence of 500 µM FMN (TCI America,
RO0023) or FAD (Sigma, F8384) and subsequently infected with HCV at a MOI of
0.1. After 3 days of infection, levels of infection were determined using
immunofluorescence, western blot and qPCR. Anti-HCV core 1b (Abcam ab2740) was used at
1:500 for immunofluorescence and 1:1,000 for western blotting. Anti-DENV-2 NS5
(GeneTex GTX103350) was used at a 1:2,500 dilution for western blotting. For the
lumiflavin treatment, two independent experiments with triplicate infections were
performed and one representative is shown. For the FMN/FAD complementation 
two independent experiments were performed and the average is shown. 

MTT assay. To test the effects of lumiflavin on cell viability, an MTT assay was performed
according to the manufacturer’s instructions (Sigma Cell Proliferation Kit I (MTT), 
11465007001). Three independent experiments in triplicates each were performed 
and one representative is shown. 

Transmission electron microscopy. Cells stably expressing STT3B-APEX2 were 
plated in 6-well plates and infected with DENV-2 at an MOI of 5. At 28h after infection,
cells were washed with PBS and fixed with 2% glutaraldehyde in 100 mM sodium
cacodylate, 2 mM CaCl2, pH 7.4, buffer. Cells were fixed at 4°C for 60 min then
washed three times with PBS. Fixed cells were quenched with 100 mM sodium
cacodylate, 2 mM CaCl2, pH 7.4, and 20 mM glycine. Quenched cells were washed
twice with PBS and stained using the KPL DAB reagent set (KPL, 54-10-00)
for 8h. After incubation with DAB, cells were rinsed twice with PBS, scraped
off the well using a cell scraper and pelleted. Pelleted cells were re-suspended in 10%
gelatin in 0.1 M sodium cacodylate buffer, pH 7.4, at 37°C and allowed to equil- 
ibrate for 5 min. Cells were pelleted again, excess gelatin removed, then chilled 
in cold blocks and covered with cold 1% osmium tetroxide (EMS, 19100) for 2h 
rotating in a cold room. They were then washed three times with cold ultrafiltered 
water, then en bloc stained overnight in 1% uranyl acetate at 4°C while rotating. 
Samples were then dehydrated in a series of ethanol washes for 20 min each at 4°C 
beginning at 30%, 50%, 70%, 95% where the samples were then allowed to rise 
to room temperature, changed to 100% ethanol twice, then propylene oxide for 
15 min. Samples were then infiltrated with EMbed-812 resin (EMS, 14120) mixed 
1:2, 1:1 and 2:1 with propylene oxide for 2h each, leaving samples in 2:1 resin
to propylene oxide overnight rotating at room temperature in the hood. The samples
were then placed into EMbed-812 for 2-4h then placed into molds with labels
and fresh resin, orientated and placed into a 65°C oven overnight. Sections were 
taken at approximately 80 nm, picked up on formvar/carbon-coated 100-mesh Cu 
grids, stained for 30s in 3.5% uranyl acetate in 50% acetone followed by staining 
in 0.2% lead citrate for 3 min. Sections were observed in a JEOL JEM-1400 at 120 kV and photos
were taken using a Gatan Orius 4k × 4k digital camera.

Mass spectrometry 

Liquid chromatography-tandem mass spectrometry. Elutions from co- 
immunoprecipitations were trypsin digested and purified using Sep Pak C18 
purification column. Peptides were analysed using an LTQ Velos Orbitrap 
mass spectrometer (Thermo Fisher Scientific) coupled to an Agilent 1100 high
performance liquid chromatography pump (Agilent Technologies) and a MicroAS 
autosampler (Thermo Fisher Scientific). Peptide mixtures were introduced into 
the mass spectrometer via a fused silica microcapillary column (100 µm inner
diameter) ending in an in-house pulled needle tip (internal diameter ~5 µm).
Columns were packed to a length of 17cm with a C18 reversed-phase resin (Magic 
C18AQ; Michrom Bioresources). Peptides were loaded onto the column and then 
eluted into the nanospray ionization source of the mass spectrometer via a two-step 
gradient of 7-25% buffer B (2.5% water and 0.1% formic acid in acetonitrile (v/v)) 
in buffer A (2.5% acetonitrile and 0.1% formic acid in water (v/v)) over 60 min 
followed by a second phase of 25-45% buffer B over 20 min. Eluting peptides 
were measured by the LTQ Velos Orbitrap operating in a data-dependent mode in 
which 10 ion-trap MS/MS spectra were acquired per data-dependent cycle from a 
high-resolution (R= 60,000) precursor spectrum (mass range = 360-1,600 m/z). 
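
The two-step gradient described above is a piecewise-linear ramp of buffer B. The sketch below assumes each step is linear in time and that time zero marks the start of the 7% step; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
def percent_buffer_b(minutes):
    """Percentage of buffer B at a given time into the elution gradient.

    Step 1: 7% to 25% B over the first 60 min.
    Step 2: 25% to 45% B over the following 20 min.
    Linear ramps are an assumption made for this illustration.
    """
    if minutes < 0 or minutes > 80:
        raise ValueError("time must lie within the 80-min gradient")
    if minutes <= 60:
        return 7.0 + (25.0 - 7.0) * minutes / 60.0
    return 25.0 + (45.0 - 25.0) * (minutes - 60.0) / 20.0


# Buffer B percentage sampled every 20 min across the gradient.
profile = [percent_buffer_b(t) for t in (0, 20, 40, 60, 80)]
```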
Mass spectrometry data processing. Raw data files produced by the mass spectrometer
were converted to the mzXML format using in-house software, MS Convert. MS
and MS/MS data were extracted from mzXML files with in-house software. MS/MS 
spectra were analysed using Sequest algorithm searching a composite target-decoy 
protein sequence database. The target sequences consisted of human proteins down- 
loaded from the Uniprot database (11-17-2014) and protein sequence corresponding 
to the dengue virus 2 16681 polyprotein. Decoy sequences were created by reversing 
the orientation of all target sequences. Parameters used for all searches included the 
requirement of trypsin peptide cleavage, two missed cleavages allowed, peptide mass 
tolerance of 20 p.p.m., variable oxidation of methionine residues (+15.99491 Da), and 
static carbamidomethylation of cysteine residues (+57.02146 Da). Decoy peptide
identifications guided the creation of filtering criteria delivering preliminary sets of 
peptide-spectrum matches with estimated false discovery rate <1%. Spectral counts 
for each condition were combined at a protein level and normalized by protein length 
to infer protein abundances in each case. 
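
The target-decoy logic and the length normalization described above can be sketched as follows. This is an illustration, not the in-house software: the `rev_` prefix, the score-threshold filter, the decoy/target ratio used as the false discovery rate estimate, and all function names are assumptions.

```python
def make_decoys(targets):
    """Build decoy entries by reversing each target protein sequence."""
    return {"rev_" + name: seq[::-1] for name, seq in targets.items()}


def estimated_fdr(psms, threshold):
    """Estimate FDR among matches scoring at or above `threshold`.

    `psms` is a list of (score, is_decoy) pairs. The estimate used here is
    decoy hits divided by target hits, one common convention.
    """
    passing = [is_decoy for score, is_decoy in psms if score >= threshold]
    decoys = sum(passing)
    targets = len(passing) - decoys
    return decoys / targets if targets else 0.0


def length_normalized_counts(spectral_counts, protein_lengths):
    """Spectral counts per residue, as a rough proxy for protein abundance."""
    return {
        protein: count / protein_lengths[protein]
        for protein, count in spectral_counts.items()
    }


# Hypothetical peptide-spectrum matches: (score, is_decoy).
psms = [(5.0, False), (4.0, False), (3.5, True), (3.0, False), (1.0, True)]
strict_fdr = estimated_fdr(psms, 4.0)
loose_fdr = estimated_fdr(psms, 3.0)
```

In practice the threshold would be raised until the estimate falls below the 1% target quoted in the text.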

DENV reporter virus and DENV replicon design and generation 

Construction of pDENV-Luc replicon. The design of the DENV replicon was based 
on DVRep described previously*®: The viral 5’ untranslated region (UTR) was 
followed by a duplication of the first 102 nucleotides of the C coding region, which 
contain cis-acting elements required for replication (CAE). The CAE was fused 
to the renilla luciferase coding region followed by the DENV open reading frame 
(ORF) starting at the signal peptide preceding NS1. Between the luciferase and the
DENV structural proteins a foot and mouth disease virus (FMDV) 2A sequence
was introduced to provide cotranslational cleavage and release of luciferase. The 
construct was based on pD2/IC-30P, which contains a full-length infectious clone 
encoding DENV-2 strain 16681 (ref. 37). We also included the amino acid muta- 
tion Q399H in the envelope protein. We gene-synthesized a fragment containing 
the T7 polymerase promoter sequence followed by the first 102 nucleotides of the 
C coding region in frame with Renilla luciferase and FMDV 2A followed by the 
DENV open reading frame (ORF) starting at the signal peptide preceding NS1 
until an internal HpaI site. This fragment was released by SacI (preceding the
T7 promoter) and HpaI and cloned in pD2/IC-30P in a three point ligation with
KpnI/SacI and KpnI/HpaI fragments.

Construction of pDENV-Luc infectious clone. The design of the DENV reporter 
was based on mDV-R described previously**: The viral 5’ UTR was followed by 
a duplication of the first 104 nucleotides of the C coding region, which contain 
cis-acting elements required for replication (CAE). The CAE was fused to the 
renilla luciferase coding region followed by the complete DENV ORF. Between 
the luciferase and the DENV structural proteins an FMDV 2A sequence was
introduced to provide cotranslational cleavage and release of luciferase. The con- 
struct was based on pD2/IC-30P, which contains a full-length infectious clone 
encoding dengue virus serotype 2 strain 16681 (ref. 37) in which an envelope 
Q399H mutation was introduced that enhanced viral infection in mammalian 
cells using primers 5'-GGAAGTTCTATCGGCCACATGTTTGAGACAAC-3'
and 5'-GTTGTCTCAAACATGTGGCCGATAGAACTTCC-3' via the
QuikChange Site-Directed Mutagenesis kit (Agilent Technologies). We 
gene-synthesized a fragment containing the T7 polymerase promoter sequence
followed by the first 104 nucleotides of the C coding region in frame with 
Renilla luciferase and FMDV 2A. This fragment was PCR amplified, introducing
a SacI site at the 5' end and an NheI site (present in the FMDV 2A
sequence) at the 3' end using primers: 5'-CGAAATTCGAGCTCACGCG-3' and
5'-TCCTGCTAGCTTGAGCAAATCAAAGTTC-3'. To create an in-frame
fusion of FMDV 2A with the DENV ORF, a second DNA fragment was amplified
using pD2/IC-30P as template with primers:
5'-TCAAGCTAGCAGGAGACGTTGAGTCCAACCCCGGGCCCATGAATAACCAACGGAAAAAGGCG-3' and
5'-GGAAGAGCATGCAGTCGGAAATG-3', introducing 5' NheI and 3' SphI
restriction sites. The two fragments were cut with the respective restriction
enzymes and ligated into pD2/IC-30P cut with SacI and SphI to create pDENV-Luc.
DENV-Luc virus was produced by linearizing pDENV-Luc with XbaI, in vitro
transcription of the linearized plasmid and transfection into BHK cells
using Lipofectamine 2000.
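
The two Q399H mutagenesis primers quoted in this section form a complementary pair, as is usual for this style of site-directed mutagenesis. A quick way to confirm that a transcribed primer pair is intact is to compare one primer with the reverse complement of the other; the helper below is a generic sketch and its names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"non-ACGT characters found: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


# Primer sequences as given in the text for the envelope Q399H mutation.
Q399H_FORWARD = "GGAAGTTCTATCGGCCACATGTTTGAGACAAC"
Q399H_REVERSE = "GTTGTCTCAAACATGTGGCCGATAGAACTTCC"

primers_are_complementary = reverse_complement(Q399H_FORWARD) == Q399H_REVERSE
```

The same check raises an error on any character outside A, C, G and T, which makes it a convenient screen for transcription damage in long primer lists.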

Construction of lenti- or retroviral constructs. PMX-IRES-BLAST-DEST was 
made by cutting pMXs-IRES-Blasticidin Retroviral Vector (Cell Biolabs, RTV-016)
with SnaBI and the Gateway destination cassette (reading frame A) was blunt
cloned into this vector according to the manufacturer's protocol (Gateway Vector
Conversion System; Invitrogen, 11828-029).

To generate a lentiviral construct expressing STT3A-Flag, Dharmacon 
cDNA BC020965 was used as template to generate a PCR product using 
primers 5'-CACCATGACTAAGTTTGGATTTTTGCG-3' and
5'-TTACTTATCGTCGTCATCCTTGTAATCTGTCCTTGACAAGCCTCGATT-3'.

Amplified PCR product was then TOPO cloned into Gateway compatible entry
vector pENTR/D-TOPO Cloning Kit (Life Technologies K2400-20) and gateway 
reaction (Life Technologies) used to insert into pLenti-CMV-Puro-Dest (w118-1). 

To generate a lentiviral construct expressing STT3B-Flag, Dharmacon cDNA 
BC052433 was used as template to generate PCR product using forward
primer 5'-CACCATGTCTTGGTGGGATTATGGC-3' and reverse primer
5'-TTACTTATCGTCGTCATCCTTGTAATCAACAGTCTTCTTAGAGGTCTTCTT-3'.
It should be noted that we used Mus musculus STT3B because we were unable to 
clone human STT3B. 

Amplified PCR product was then TOPO cloned into Gateway compatible entry 
vector pENTR/D-TOPO Cloning Kit (Life Technologies, K2400-20) and gateway 
reaction was used to insert into pLenti-CMV-Puro-Dest (w118-1). 

To generate an SHBG-expressing construct, SHBG was ordered as two 
geneblocks and used to generate PCR product with primers
5'-TGTGGTGGAATTCTGCAGATACCTGTGGTGGAATTCTGCAGATACC-3' and
5'-ATCCAGCACAGTGGCGG-3'. The PCR product was Gibson cloned (Gibson
assembly reaction kit; New England Biolabs) into pLenti-CMV-Puro-Dest (w118-1)
that was EcoRV digested. 

To generate a Flag3×-RPS25 expression construct, entry vector pENTR-
Flag3×-RPS25 was generated as described® using the forward primer
CACCATGGACTACAAAGACCATGACGG. pENTR-Flag3×-RPS25 was then
used in a Gateway reaction (Life Technologies) to introduce Flag3×-RPS25 into
the PMX-IRES-BLAST-DEST retroviral expression construct.

A construct expressing STT3B fused with APEX2 and a Flag tag was generated
using the pLenti-STT3B expression construct described above and APEX2
Addgene plasmid 49386 (refs 40, 41) as templates to generate PCR products
using primers 5'-GACTCTAGTCCAGTGTGGTG-3' with
5'-AACAGTCTTCTTAGAGGTCTTC-3' and
5'-GAAGACCTCTAAGAAGACTGTTATGGACTACAAGGATGACGA-3' with
5'-CGGCCGCCACTGTGCTGGATTTAGGCATCAGCAAACCCAAG-3'. PCR products were Gibson
assembled (Gibson assembly reaction kit; New England Biolabs) into pLenti-CMV-
Puro-Dest (w118-1) that was EcoRV digested.

To generate a STT3B-doxycycline-lenti construct, Dharmacon cDNA
BC052433 was used as a template to generate a PCR product using primers
5'-CACCATGTCTTGGTGGGATTATGGC-3' and
5'-TTACTTATCGTCGTCATCCTTGTAATCAACAGTCTTCTTAGAGGTCTTCTT-3'.

The amplified PCR product was then TOPO cloned into the Gateway-compatible entry
vector pENTR/D-TOPO (Cloning Kit; Life Technologies, K2400-20), and a Gateway
reaction was used to insert it into the doxycycline-inducible lentiviral vector pLenti
CMVTRE3G Puro DEST (w811-1) (Addgene, 27565).

To generate catalytic site mutants (Extended Data Fig. 5 and refs 15, 42) of the
STT3A- and STT3B-expressing constructs, DNA fragments were generated using
pLenti-STT3B-Flag described above as a template, with pLenti-EcoRV primers
and mutant primers, to generate two PCR products (see primers below). Both PCR
products for each mutation were Gibson cloned (Gibson assembly reaction kit;
New England Biolabs) into pLenti-CMV-Puro-Dest (w118-1) that was EcoRV
digested.

Primers forward (F) and reverse (R) were as follows: pLenti EcoRV-F
5'-GACTCTAGTCCAGTGTGGTG-3', pLenti EcoRV-R 5'-ATCCAGAGGTTGATTGTCGAG-3'.
STT3A mutations: E63A-F 5'-CAGGTTCCTGGCTGAGGCCGGGTTTTATAAATTCCATAACTGG-3',
E63A-R 5'-CCGGCCTCAGCCAGGAACCTG-3';
D167A-F 5'-CTGTGGCTGGCTCCTATGCCAATGAAGGGATTGCCATCTTTTG-3',
D167A-R 5'-CATTGGCATAGGAGCCAGCCACAG-3';
E351Q-F 5'-CCATCATTGCTTCTGTGTCTCAGCATCAGCCCACAACCTG-3',
E351Q-R 5'-ATGCTGAGACACAGAAGCAATGATGG-3'.
STT3B mutations: D100A-F 5'-ATCATCCACGAGTTCGCCCCGTGGTTTAACTATAG-3',
D100A-R 5'-CTATAGTTAAACCACGGGGCGAACTCGTGGATGAT-3';
D218A-F 5'-CAGTGGCGGGATCCTTTGCCAATGAAGGCATTGCCATT-3',
D218A-R 5'-AATGGCAATGCCTTCATTGGCAAAGGATCCCGCCACTG-3';
E402Q-F 5'-CAATTATTGCATCAGTGTCTGAGCATCAGCCTACGACATGG-3',
E402Q-R 5'-CCATGTCGTAGGCTGATGCTGAGACACTGATGCAATAATTG-3'.
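
Because the two PCR products for each mutation are joined by Gibson assembly, the mutant forward and reverse primers would be expected to overlap at the mutated site. This can be checked from the sequences alone: the reverse complement of each reverse primer should match the 5' end of its forward partner. The sketch below, in Python, does this for three of the pairs listed above; the function and variable names are illustrative and are not part of the published protocol.

```python
# Illustrative overlap check for mutagenic primer pairs quoted in the text.
# Names (reverse_complement, PRIMER_PAIRS, overlaps) are hypothetical.


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# (forward primer, reverse primer), both written 5' to 3'.
PRIMER_PAIRS = {
    "STT3A E63A": (
        "CAGGTTCCTGGCTGAGGCCGGGTTTTATAAATTCCATAACTGG",
        "CCGGCCTCAGCCAGGAACCTG",
    ),
    "STT3B D100A": (
        "ATCATCCACGAGTTCGCCCCGTGGTTTAACTATAG",
        "CTATAGTTAAACCACGGGGCGAACTCGTGGATGAT",
    ),
    "STT3B D218A": (
        "CAGTGGCGGGATCCTTTGCCAATGAAGGCATTGCCATT",
        "AATGGCAATGCCTTCATTGGCAAAGGATCCCGCCACTG",
    ),
}

# True when the reverse primer is the reverse complement of the
# 5' end of the forward primer, i.e. the two PCR products share that end.
overlaps = {
    name: forward.startswith(reverse_complement(reverse))
    for name, (forward, reverse) in PRIMER_PAIRS.items()
}

for name, ok in overlaps.items():
    print(name, ok)
```

For these three pairs the check passes: the STT3B pairs are fully complementary, and the STT3A E63A reverse primer covers the first 21 nucleotides of the forward primer.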

To generate ELAVL1 fused with Flag, cDNA was prepared from total RNA of Huh7.5.1
cells using Biorad RT Superscript, and ELAVL1 was PCR amplified with
5'-TGTGGTGGAATTCTGCAGATACCATGTCTAATGGTTATGAAGACCA-3'
and 5'-CGGCCGCCACTGTGCTGGATTTACTTATCGTCGTCATCCTTGTAATCTTTGTGGGACTTG-3'.
Next, using Gibson Assembly, the PCR product
was cloned into pLenti-CMV-Puro-Dest (w118-1) that was digested with EcoRV.

To generate RPN1 fused to 2Strep, cDNA BC010839 was PCR amplified with
5'-CACCATGGAGGCGCCAGCCGC-3' and 5'-CTACAGGGCATCCAGGATG-3'.
The amplified fragments were cloned into P-ENTR-D-Topo (Invitrogen),
then a Gateway LR reaction was performed to shuttle the cDNA into the pLenti-CMV-
puro expression vector.

To generate SSR2 fused to 2Strep, cDNA NM_003145.3 was PCR amplified
with 5'-CACCATGAGGCTGCTGTCATTTGTG-3' and
5'-TCAGTTCTTCTTCGTTTTGGGAG-3'. The amplified fragments were cloned into
P-ENTR-D-Topo (Invitrogen), then a Gateway LR reaction was performed to shuttle
the cDNA into the pLenti-CMV-puro expression vector.

To generate SSR3 fused to 2Strep, cDNA NM_003145.3 was PCR amplified
with 5'-CACCATGGCTCCTAAAGGCAGCTC-3' and
5'-CTATTTGGAGCCAGTAGACAG-3'. The amplified fragments were cloned into
P-ENTR-D-Topo (Invitrogen), then a Gateway LR reaction was performed to shuttle
the cDNA into the pLenti-CMV-puro expression vector.

To generate ASCC2 fused to 2Strep, BC025368 was PCR amplified
with 5'-CACCATGCCAGCTCTGCCCCTGG-3' and
5'-TCAGGATGGGATCATGCCTTTGCT-3'. The amplified fragments were cloned into
P-ENTR-D-Topo (Invitrogen), then a Gateway LR reaction was performed to shuttle
the cDNA into the pLenti-CMV-puro expression vector.

To generate RPS25 fused to 2Strep, NM_001028 was PCR amplified
with 5'-CACCATGGACTACAAAGACCATGACG-3' and
5'-TTAATTAACCTCGAGTTTAAACGCG-3'. The amplified fragments were cloned into
P-ENTR-D-Topo (Invitrogen), then a Gateway LR reaction was performed to
shuttle the cDNA into the pLenti-CMV-puro expression vector.

To generate the UBE2J1 expression construct, cDNA provided by R. Kopito
was PCR amplified with
5'-TGTGGTGGAATTCTGCAGATACCATGGAGACCCGCTACAACCTG-3' and
5'-CGGCCGCCACTGTGCTGGATTTATAACTCAAAGTCAAATATGTATTC-3'.
The amplified fragments were cloned
into the pLenti-CMV-puro expression vector by a Gibson Assembly reaction.

To generate the SEL1L expression construct, cDNA provided by R. Kopito was
PCR amplified with
5'-TGTGGTGGAATTCTGCAGATACCATGCGGGTCCGGATAGGGCTG-3' and
5'-CGGCCGCCACTGTGCTGGATTTAAAGTCTACTTACCAAAACCATG-3'.
The amplified fragments were cloned into
the pLenti-CMV-puro expression vector by a Gibson Assembly reaction.

To generate AUP1, glycerol stocks containing AUP1 cDNA in pENTR entry
vector were ordered from Dharmacon (OHS5894-99868092), and a Gateway LR
reaction was performed to shuttle the cDNA into the pLenti-CMV-puro expression
vector.
Comparison of knockout screens to siRNA screens. To compare the top
30 host factor genes of the knockout screens to results from previous short
interfering RNA (siRNA) screens, we used the following data. (1) Sessions
et al. (supplementary table 2): to rank the list by strength of phenotype, we
used the P value. If multiple siRNA sequences per gene were present, we used
the one with the stronger effect. (2) Krishnan et al. (supplementary table 1):
to rank the identified DENV host factors, column AM was filtered for 'required
by both WNV and dengue'. The remaining genes were sorted by 'pooled siRNA
Fold reduction of DENV' (column AN). (3) Tai et al. (table S2): to sort by
phenotype, we chose the validated genes scoring with at least two siRNAs and
ranked them by P value. (4) Li et al. (dataset S1): to rank the genes, we took
the mean of the average normalized percentage of infected cells (part one or two) of
the four siRNAs. The top 10 genes based on phenotype as explained above are
shown in Extended Data Fig. 10.
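
The ranking rule described for the P-value-based screens can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and P values are invented, and 'the one with the stronger effect' is interpreted here as the siRNA with the smallest P value, which the text does not state explicitly.

```python
# Illustrative ranking of siRNA screen hits; all data below are made up.
from typing import Iterable, List, NamedTuple


class SirnaResult(NamedTuple):
    gene: str
    sirna_id: str
    p_value: float


def rank_genes(results: Iterable[SirnaResult], top_n: int = 10) -> List[str]:
    """Keep the strongest siRNA per gene, then rank genes by P value."""
    best = {}
    for result in results:
        current = best.get(result.gene)
        # Assumption: a smaller P value means a stronger effect.
        if current is None or result.p_value < current.p_value:
            best[result.gene] = result
    ranked = sorted(best.values(), key=lambda r: r.p_value)
    return [r.gene for r in ranked[:top_n]]


# Hypothetical input: GENE_A has two siRNAs, the others one each.
example = [
    SirnaResult("GENE_A", "si1", 0.04),
    SirnaResult("GENE_A", "si2", 0.001),
    SirnaResult("GENE_B", "si1", 0.01),
    SirnaResult("GENE_C", "si1", 0.2),
]

top_hits = rank_genes(example, top_n=2)
print(top_hits)  # ['GENE_A', 'GENE_B']
```

Screens ranked by another measure (for example fold reduction or percentage of infected cells) would need a different sort key, but the per-gene reduction step stays the same.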




31. Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide
libraries for CRISPR screening. Nat. Methods 11, 783-784 (2014).

32. Luo, B. et al. Highly parallel identification of essential genes in cancer cells.
Proc. Natl Acad. Sci. USA 105, 20380-20385 (2008).

33. Berger, K. L. et al. Roles for endocytic trafficking and phosphatidylinositol
4-kinase III alpha in hepatitis C virus replication. Proc. Natl Acad. Sci. USA 106,
7577-7582 (2009).

34. Canver, M. C. et al. Characterization of genomic deletion efficiency mediated by
clustered regularly interspaced palindromic repeats (CRISPR)/Cas9 nuclease
system in mammalian cells. J. Biol. Chem. 289, 21312-21324 (2014).

35. Horii, T. et al. Validation of microinjection methods for generating knockout
mice by CRISPR/Cas-mediated genome engineering. Sci. Rep. 4, 4513 (2014).

36. Alvarez, D. E., De Lella Ezcurra, A. L., Fucito, S. & Gamarnik, A. V. Role of RNA
structures present at the 3'UTR of dengue virus on translation, RNA synthesis,
and viral replication. Virology 339, 200-212 (2005).

37. Kinney, R. M. et al. Construction of infectious cDNA clones for dengue 2 virus:
strain 16681 and its attenuated vaccine derivative, strain PDK-53. Virology
230, 300-308 (1997).

38. Samsa, M. M. et al. Dengue virus capsid protein usurps lipid droplets for viral
particle formation. PLoS Pathog. 5, e1000632 (2009).

39. Fuchs, G. et al. Kinetic pathway of 40S ribosomal subunit recruitment to
hepatitis C virus internal ribosome entry site. Proc. Natl Acad. Sci. USA 112,
319-325 (2015).

40. Lam, S. S. et al. Directed evolution of APEX2 for electron microscopy and
proximity labeling. Nat. Methods 12, 51-54 (2015).

41. Martell, J. D. et al. Engineered ascorbate peroxidase as a genetically encoded
reporter for electron microscopy. Nat. Biotechnol. 30, 1143-1148 (2012).

42. Jaffee, M. B. & Imperiali, B. Exploiting topological constraints to reveal buried
sequence motifs in the membrane-bound N-linked oligosaccharyl transferases.
Biochemistry 50, 7557-7567 (2011).




[Figure: panel a, bar charts of enriched GO process and GO function terms for the DENV and HCV screens (x axis, -log(p-value)); panel b, subcellular location (nucleus, ER, Golgi, mitochondria, plasma membrane, cytoplasm, secreted, other/unknown) of DENV and HCV hits (x axis, % of top 30 host factors); panel c, % infection relative to WT for DENV and HCV in DENV host factor KO and HCV host factor KO cells.]


Extended Data Figure 1 | Divergence of DENV and HCV host factors.
a, Gene Ontology (GO) analysis for DENV and HCV CRISPR screens
on the ranked gene lists. Curated (by redundancy) enriched GO terms
are shown. A complete list of all enriched GO terms can be found in
Supplementary Table 4. b, Distribution of the subcellular location of the
30 most enriched host factors for DENV and HCV. c, Cross-comparison of
the effects of DENV or HCV host factor knockout in Huh7.5.1 cells on the
replication of DENV or HCV using reporter viruses expressing luciferase.
Data are mean and s.d. for triplicate infections.




[Figure: panel a, ranked lists of DENV and HCV host factors with their rank in replicates 1-3, colour-coded as top 0.1%, top 1% or top 5%; panel b, gene enrichment plotted against gene rank (0-20,000) for replicates 1-3 of the DENV and HCV screens.]
Extended Data Figure 2 | Reproducibility of CRISPR screens. a, Ranked
lists of the 30 most enriched DENV and HCV host factors and their
rankings in the individual replicate screens. The colour code reflects in
what percentile the gene scored in the replicate. b, Gene enrichment based
on RIGER score for the individual replicate screens. Red dots highlight
where the 30 most significant host factors ranked in the individual
replicates.




[Figure: sequence alignments of target sites and mutant alleles. Panel a, gene-trap flanking sequences for HAP1-STT3A-KO, HAP1-STT3B-KO, HAP1-RPN1-KO, HAP1-SSR2-KO, HAP1-SSR3-KO, HAP1-RPS25-KO and HAP1-ASCC2-KO. Panel b, HAP1-AUP1: mutant allele +260 bp (260-bp insertion). Panel c, HAP1-UBE2J1: mutant allele +1 bp (A insertion); HAP1 double STT3A/STT3B: STT3A mutant allele -57 bp, STT3B mutant allele -25 bp; HAP1-SEL1L: mutant alleles +1 bp (T insertion) and -13 bp. Panel d, Huh7-STT3A: mutant alleles -399 bp and -1 bp; Huh7-STT3B: mutant alleles +1 bp (A insertion) and +1 bp (T insertion); Huh7-MAGT1: mutant alleles -1 bp and +193 bp (193-bp insertion). Panel e, immunoblots of WT and KO cells for MAGT1, RPN1 and STT3B with p84 as loading control.]

Extended Data Figure 3 | Genotyping of cell lines for DENV host
factors. a, The site of gene-trap insertion in HAP1 cell lines was
determined using a PCR-based method. Bases depicted in red are the
flanking sequences upstream of the gene-trap insertion. Bases depicted
in green are downstream gene-trap flanking sequences. b, TALENs were
used to edit the genomic region of AUP1 in HAP1 cells. Bases depicted in
red are the left TALEN-binding site, bases depicted in blue are the right
TALEN-binding site. Bases depicted in green are the TALEN target site.
Arrow indicates site of 260-bp insertion. c, CRISPR Cas9 nuclease was
targeted to bases depicted in red in HAP1 cells. Editing events are depicted
at the guide RNA target sites below the wild-type sequence. d, CRISPR
Cas9 nuclease was targeted to bases depicted in red in Huh7 cells. Editing
events are depicted at the guide RNA target sites below the wild-type
sequence. e, Immunoblots of wild type and knockout cell lines.


[Figure: panel a, DENV titres (PFU) in knockout cells, with ND marking samples with no plaques detected; panels b and d, luciferase levels in knockout and addback cells; panel c, crystal violet staining; panel e, DENV-Luc and HCV-Luc time courses in WT, STT3A-KO and STT3B-KO cells (x axis, hours post infection); panel f, schematic of the STT3A OST and STT3B OST complexes between ER lumen and cytoplasm, with subunits including RPN1, RPN2, OST4, DDOST and OSTC.]

Extended Data Figure 4 | Validation of DENV host factor genes.
a, Plaque-forming units (PFU) assay of DENV infection. ND, no plaques
detected (threshold of detection is 6 PFU ml⁻¹). b, DENV luciferase levels
in HAP1 isogenic knockout cells complemented using lentiviral stable
expression of corresponding genes. c, Crystal violet of complemented
Huh7 knockout cells infected with DENV. d, DENV luciferase levels in
Raji DC-SIGN cells with knockout in DENV host factors (lentiCRISPRv2).
Empty denotes an empty vector control (expressing Cas9 but no guide
RNA), and NT denotes a cell line expressing a non-targeting guide RNA.
e, Time course of DENV and HCV expressing Renilla luciferase in Huh7
knockout cells. f, Schematic diagram of the STT3A and STT3B isoforms.
Gene names in red indicate OST subunits identified in the DENV screens.
Data are mean and s.d. for triplicate infections.




[Figure: panel a, membrane topology schematic (ER lumen, cytosol) marking the catalytic site residues D56, D154 and E319, with the alignment below; panel b, anti-FLAG immunoblots of Huh7-STT3A-KO and Huh7-STT3B-KO cells with wild-type (wt), catalytic mutant (cat) or no addback.]

                     D56                   D154                  E319
C. lari pglb         FNDQLMITTNDGYAFAEGAR  YYNRTMSGYYDTDMLVLVLP  MYFNVNETIMEVNTIDPEVF
S. cerevisiae STT3   NYRATKYLVNNSFYKFLNWF  YISRSVAGSYDNEAIAITLL  IHIPIIASVSEHQPVSWPAF
H. sapiens STT3A     NYRTTRFLAEEGFYKFHNWF  YISRSVAGSYDNEGIAIFCM  NNIPIIASVSEHQPTTWSSY
M. musculus STT3B    IRFESIIHEFDPWFNYRSTH  YISRSVAGSFDNEGIAIFAL  IHIPIIASVSEHQPTTWVSF


Extended Data Figure 5 | Catalytic site mutations introduced in
mammalian STT3A and STT3B. a, Catalytic site amino acids highlighted
in red as identified in the bacterial STT3 (Campylobacter lari pglb). Strong
conservation allows their identification in other species. Alignments of
STT3 isoforms across different species highlight the conserved catalytic
sites that were mutated. The table specifies the amino acid position and
the specific triple mutations that were made to abolish catalytic activity.
b, Huh7 STT3A and STT3B knockout cells expressing Flag-tagged STT3A
and STT3B wild-type and catalytic mutants.



[Figure: panel b, luminescence in Huh7 and Huh7-STT3B-KO cells with no addback, STT3B-FLAG or STT3B-APEX2; panel c, electron micrographs labelled 'Uninfected without STT3B-APEX2', 'DENV infected with STT3B-APEX2' and 'Uninfected with STT3B-APEX2'; panels d-f, blots probed with anti-NS4B and anti-FLAG from +DENV2 and uninfected cells; panel g, peptide map with STT3A in green and STT3B in red.]

Extended Data Figure 6 | Physical interaction between the OST
complex and the replication complex of DENV. a, APEX2, a protein tag
for electron microscopy, was fused to the C terminus of STT3B, enabling
the imaging of subcellular protein localization by deposition of a polymer
of 3,3'-diaminobenzidine (DAB). b, Luminescence of Huh7 STT3B-
knockout cells complemented with STT3B-APEX2 and infected with
DENV expressing Renilla luciferase. Data are mean and s.d. for triplicate
infections. c, STT3B localizes on ER membranes in the vicinity of DENV-
induced vesicle packets as shown by transmission electron microscopy
micrograph of DENV-infected or uninfected Huh7 cells expressing
the STT3B-APEX2 construct. N represents the cell nucleus and the
arrowheads in samples transfected with STT3B-APEX2 represent APEX
polymerized DAB staining in the lumen of the ER or around DENV-
induced vesicle packets (VP). d, Co-immunoprecipitations of STT3A-Flag
and STT3B-Flag from DENV-infected cell lysates. LE, long exposure.
e, Anti-Flag western blots of immunoprecipitation elutions of DENV-
infected cells stably expressing Flag-tagged STT3A, STT3B and RPS25.
f, SYPRO Ruby staining of elutions and inputs of immunoprecipitations
of DENV-infected cell lysates. g, Co-immunoprecipitation elutions of
DENV-infected lysates were analysed by mass spectrometry and
DENV-specific peptides aligned to DENV polyprotein.




[Figure: panel a, Sanger sequencing traces of mutant and wild-type loci for CD81, SRRD, ELAVL1, ANKRD49, FKBPL, RFK, FLAD1 and MIR122, with the CRISPR target, PAM and cleavage site marked; panel b, immunoblots of knockout cells.]
Extended Data Figure 7 | Analysis of HCV host factor knockout cell
lines. a, Genotyping of CRISPR-induced knockout Huh7.5.1 cells by
Sanger sequencing showing the mutated locus and the wild-type reference.
CRISPR/Cas9 induces mutations close to the PAM site resulting in
frameshifts. CD81 and ELAVL1 knockout cell lines are subclones,
whereas others are populations of cells mutagenized with lentiCRISPRv2.
b, Immunoblots of CRISPR-induced knockout cells.




[Figure: panel a, HCV infection with guide RNAs 1-4 per gene; panel b, viral RNA in WT and ELAVL1-KO cells; panel c, crystal violet assay for HCV, DENV, SINV and PV; panel d, HCV-Luc-Replicon (WT) and HCV-Luc-Replicon (GND) time courses (x axis, hours post electroporation); panel e, time course for KO, ELAVL1-KO and ELAVL1-KO + ELAVL1-FLAG cells, with western blot of Huh7.5.1 cells with and without ELAVL1-FLAG and p84 as loading control.]
Extended Data Figure 8 | ELAVL1 is a critical host factor for HCV
replication. a, HCV luciferase infection in knockout cell lines using four
different guide RNAs per gene. NT, non-targeting guide RNA. b, qPCR
of viral RNA in wild-type or ELAVL1-knockout Huh7.5.1 cells. c, Crystal
violet assay for different RNA virus infections. d, HCV replicon assays
using wild-type sgJFH1 (left) or GND sgJFH1 replicon. e, Transfection of
ectopically expressed ELAVL1 restores HCV replication. Western blot of
ELAVL1-Flag transfected and untransfected Huh7.5.1 ELAVL1-knockout
cells. Data are mean and s.e.m. (qPCR) or s.d. (FFU, RLU) for triplicate
infections, except in panel e, which was a single infection.




[Figure: panel a, HCV and DENV RNA in WT, RFK-KO and FLAD1-KO cells; panel b, immunofluorescence of uninfected, +HCV, +HCV+LF, +HCV+FMN and +HCV+FAD cells; panel c, western blot for HCV core, p84, DENV NS5 and GAPDH at lumiflavin concentrations of UT, 10, 50 and 100 μM; panel d, untreated versus lumiflavin (50 μM) for HCV, WNV, YFV, SINV, VEEV, PV and HRV-14; panel e, cell proliferation against lumiflavin concentration (μM); panel f, HCV replication after FMN or FAD addition.]

Extended Data Figure 9 | Lumiflavin inhibits the replication of HCV
but not of other RNA viruses. a, qPCR of HCV or DENV RNA replication
in wild-type, RFK-knockout or FLAD1-knockout Huh7.5.1 cells.
b, Immunofluorescence of HCV infection in wild-type, RFK-knockout
and FLAD1-knockout Huh7.5.1 cells under treatment with lumiflavin,
FMN or FAD. HCV core protein (green). Blue denotes DAPI (nuclear)
staining. Scale bar, 57 μm. c, Western blot for HCV core and DENV
NS5 in untreated (UT) and lumiflavin-treated Huh7.5.1 cells. p84 and
GAPDH served as loading controls. d, qPCR of RNA viruses in untreated
or lumiflavin-treated Huh7.5.1 cells. e, MTT cell proliferation assay for
lumiflavin-treated Huh7.5.1 cells. f, Restoration of HCV replication in
lumiflavin-treated cells by exogenous addition of FMN or FAD. Data are
mean and s.e.m. (qPCR) or s.d. (MTT) for triplicate infections/treatments.




[Figure: Venn diagrams with the ten top-ranked hits listed next to each screen.]

a, DENV
CRISPR screen: 1. MAGT1; 2. STT3B; 3. SSR2; 4. RPN2; 5. STT3A; 6. SSR3; 7. EMC3; 8. OSTC; 9. SSR1; 10. EMC1
Haploid screen: 1. STT3B; 2. STT3A; 3. RPN2; 4. RPS25; 5. VMP1; 6. OSTC; 7. SSR2; 8. SSR3; 9. RPN1; 10. ASCC2
DENV screen in mosquito cells and validated in human cells (Sessions et al.): 1. ATP6V1B1; 2. ATP6V1B2; 3. FLJ20254; 4. ATP6V1H; 5. SYNGR1; 6. ATP6V0B; 7. ZBTB41; 8. TAZ; 9. NRG1; 10. PRR12
Validated DENV factors from WNV screen (Krishnan et al.): 1. SAP130; 2. NIPA2; 3. NDFIP1; 4. SEC61G; 5. TRHDE; 6. ATXN7L3; 7. PIAS2; 8. ALS2CR7; 9. MT2A; 10. GFER

b, HCV
CRISPR screen: 1. CLDN1; 2. ELAVL1; 3. CD81; 4. OCLN; 5. PPIA; 6. SRRD; 7. FLAD1; 8. RFK; 9. mir-122; 10. ZEB1
siRNA replicon screen (Tai et al.): 1. COPB2; 2. COPA; 3. CHMP2A; 4. COPZ1; 5. ARCN1; 6. PI4KA; 7. CKAP5; 8. CDC42; 9. (illegible); 10. SLC34A2
siRNA screens for replication and for replication + assembly (Li et al.), two lists:
1. SNF1LK2; 2. TMEM135; 3. SPCS1; 4. ANKRD20A1; 5. APC2; 6. KCTD14; 7. CDCA3; 8. GCAT; 9. ZNF337; 10. HIST1H2BK
1. ETF1; 2. FAU; 3. GPSN2; 4. PROX1; 5. CNOT3; 6. CNOT1; 7. PI4KA; 8. RAB10; 9. PRPF40A; 10. CCNB2


Extended Data Figure 10 | Comparison of knockout screen results to
previous siRNA screens. a, Venn diagram comparing the hits from the
CRISPR and haploid screens for DENV host factors to previous siRNA
screens from Sessions et al. (from supplementary table 2) and Krishnan
et al. (from supplementary table 1). The top ten validated host factors
(by strength of phenotype in the validation screen) for each screen are
shown next to the circle. b, Venn diagram comparing the hits from the
CRISPR screen for HCV host factors to previous siRNA screens from
Tai et al. (from table S2) and Li et al. (from dataset S1).




LETTER 


doi:10.1038/nature18625 


A CRISPR screen defines a signal peptide processing 
pathway required by flaviviruses 


Rong Zhang, Jonathan J. Miner, Matthew J. Gorman, Keiko Rausch, Holly Ramage, James P. White, Adam Zuiani,
Ping Zhang, Estefania Fernandez, Qiang Zhang, Kimberly A. Dowd, Theodore C. Pierson, Sara Cherry &
Michael S. Diamond


Flaviviruses infect hundreds of millions of people annually, and 
no antiviral therapy is available’. We performed a genome-wide 
CRISPR/Cas9-based screen to identify host genes that, when 
edited, resulted in reduced flavivirus infection. Here, we validated 
nine human genes required for flavivirus infectivity, and these 
were associated with endoplasmic reticulum functions including 
translocation, protein degradation, and N-linked glycosylation. 
In particular, a subset of endoplasmic reticulum-associated signal 
peptidase complex (SPCS) proteins was necessary for proper 
cleavage of the flavivirus structural proteins (prM and E) and 
secretion of viral particles. Loss of SPCS1 expression resulted in 
markedly reduced yield of all Flaviviridae family members tested 
(West Nile, Dengue, Zika, yellow fever, Japanese encephalitis, and 
hepatitis C viruses), but had little impact on alphavirus, bunyavirus, 
or rhabdovirus infection or the surface expression or secretion of 
diverse host proteins. We found that SPCS1 dependence could be 
bypassed by replacing the native prM protein leader sequences 
with a class I major histocompatibility complex (MHC) antigen 
leader sequence. Thus, SPCS1, either directly or indirectly via its 
interactions with unknown host proteins, preferentially promotes 
the processing of specific protein cargo, and Flaviviridae have a 
unique dependence on this signal peptide processing pathway. 
SPCS1 and other signal processing pathway members could 
represent pharmacological targets for inhibiting infection by the 
expanding number of flaviviruses of medical concern. 

We performed a genome-wide inhibition of West Nile virus (WNV)- 
induced cell death screen using the CRISPR/Cas9 system?” and 
lentiviruses targeting 19,050 genes (Extended Data Fig. 1a). Whereas 
in the absence of lentivirus transduction cells did not survive WNV 
infection, colonies of lentivirus-transduced cells survived; single 
guide RNAs (sgRNAs) were amplified by PCR and sequenced. We 
identified 12 genes that were statistically enriched using MAGeCK® 
(Supplementary Tables 1, 2). All 12 genes were endoplasmic reticulum- 
associated with annotated functions of carbohydrate modification, 
protein translocation and signal peptide processing, protein degrada- 
tion, and heat shock response (Fig. 1a). 
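
The principle behind this kind of survival screen can be illustrated with a toy calculation: sgRNAs that protect cells from virus-induced death become over-represented among survivors relative to the starting library. The Python sketch below computes a per-gene median log2 fold enrichment from read counts. It is deliberately simplified and is not the MAGeCK algorithm used in the study; the sgRNA names, gene names, read counts and pseudocount are all hypothetical.

```python
# Toy sgRNA enrichment calculation; NOT the MAGeCK method, and all
# identifiers and counts below are invented for illustration.
import math
from collections import Counter
from statistics import median
from typing import Dict, Iterable


def gene_enrichment(
    selected_reads: Iterable[str],
    library_reads: Iterable[str],
    sgrna_to_gene: Dict[str, str],
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """Median log2 fold enrichment (selected vs library) per gene."""
    selected = Counter(selected_reads)
    library = Counter(library_reads)
    n_guides = len(sgrna_to_gene)
    # Pseudocounts keep unobserved sgRNAs from producing log(0).
    selected_total = sum(selected.values()) + pseudocount * n_guides
    library_total = sum(library.values()) + pseudocount * n_guides

    per_gene: Dict[str, list] = {}
    for sgrna, gene in sgrna_to_gene.items():
        selected_freq = (selected[sgrna] + pseudocount) / selected_total
        library_freq = (library[sgrna] + pseudocount) / library_total
        per_gene.setdefault(gene, []).append(
            math.log2(selected_freq / library_freq)
        )
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical data: two sgRNAs per gene, equal starting representation.
guides = {"sgX1": "GENE_X", "sgX2": "GENE_X", "sgY1": "GENE_Y", "sgY2": "GENE_Y"}
library = ["sgX1"] * 10 + ["sgX2"] * 10 + ["sgY1"] * 10 + ["sgY2"] * 10
survivors = ["sgX1"] * 30 + ["sgX2"] * 20 + ["sgY1"] * 1

scores = gene_enrichment(survivors, library, guides)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 2))
```

In this toy input GENE_X scores above zero and GENE_Y below zero. A real analysis would also need normalization across samples and a statistical test across sgRNAs, which is what dedicated tools provide.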

In validation studies, editing of nine genes resulted in reduced
WNV antigen expression following infection of 293T or HeLa cells
(Fig. 1a, b) without causing cytotoxicity (Extended Data Fig. 1b). We 
confirmed the efficiency of gene editing for the proteins for which we 
could obtain validated antibodies (Extended Data Fig. 1c). Validated 
genes were tested for effects on related flaviviruses: Zika (ZIKV), 
Japanese encephalitis (JEV), Dengue serotype 2 (DENV-2), and yellow 
fever (YFV) viruses. Editing of six of these genes reduced infection by 
all four flaviviruses (Fig. 1c-f). Editing of STT3A, SEC63, SPCS1, or 
SPCS3 resulted in decreased yields of WNV and JEV (Fig. 1g, h). We 


observed less impact on unrelated positive- or negative-sense RNA 
viruses (Extended Data Fig. 1d). 

As pathogenic flaviviruses are transmitted by arthropods, we eval- 
uated the roles of orthologues of these genes in insect cells. Silencing 
of Drosophila orthologues reduced infection by WNV and DENV-2 
(Fig. 2a, b) without appreciably affecting cell viability (Fig. 2c). 
Decreased WNV infection was also observed in mosquito cells after 
gene silencing (Fig. 2d). Depletion of Spase22-23 (orthologue of SPCS3) 
in adult Drosophila led to decreased WNV titres (Fig. 2e) and flies
heterozygous for Spase12 (orthologue of SPCS1) showed reduced WNV 
infection (Fig. 2f). Overall, flavivirus infectivity in human and insect cells 
was dependent on analogous endoplasmic reticulum-associated genes. 

Trans-complementation of gene-edited human cells with wild-type 
alleles rescued flavivirus infectivity (Extended Data Fig. 1e-g). Since 
we identified the genes encoding two (SPCS1 and SPCS3) of the five 
components of the Signal Peptidase Complex⁹,¹⁰, and found that insect 
SPCS genes also affected flavivirus infection, we focused our study 
on these genes. Gene silencing in human cells confirmed that SPCS 
genes were required for optimal flavivirus but not alphavirus infection 
(Extended Data Fig. 2 and data not shown). 

We screened for clonal SPCS1 and SPCS3 knockout cell lines. 
Although we were unable to obtain SPCS3−/− clonal lines, SPCS1−/− 
293T or Huh7.5 cell clones grew, with both alleles containing nonsense 
deletions (Fig. 3a and Extended Data Fig. 3). WNV, DENV, JEV, YFV, and 
ZIKV failed to accumulate in the supernatants of SPCS1−/− 293T cells 
(Fig. 3c-f), and WNV infectivity was restored in trans-complemented 
cells (Fig. 3h). However, SPCS1−/− cells supported infection by 
alphaviruses, bunyaviruses, and rhabdoviruses (Fig. 3i-k and Extended Data 
Fig. 3a). To corroborate these findings, we tested SPCS1−/− Huh7.5 
cells and found reduced infection by WNV, ZIKV, JEV, and the related 
Flaviviridae member, hepatitis C virus (Extended Data Fig. 3e, f). In 
comparison, gene editing of the remaining SPCS genes, SEC11A and 
SEC11C, had minimal effects on infection (Extended Data Fig. 4). 

To determine whether SPCS1 was required for viral translation, 
replication, or both, we used wild-type and loss-of-function¹¹ flavivirus 
replicons encoding reporter genes¹² (Fig. 3b and Extended Data 
Fig. 5). Transfection of control cells with replicon RNA resulted in low 
levels of reporter gene activity over the first several hours, which reflects 
translation of input viral RNA, whereas subsequent signal increases are 
due to RNA replication. In SPCS1−/− cells, high levels of reporter gene 
expression were observed, indicating that viral RNA translation and 
replication remained largely intact. 

We speculated that SPCS subunits, directly or indirectly, might 
regulate cleavage of the flavivirus polyprotein¹³. Flavivirus structural 
(prM and E) and non-structural (NS1 and NS4B) proteins are cleaved 
by unknown endoplasmic reticulum host signal peptidase(s) (Fig. 3l 


¹Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri 63110, USA. ²Department of Microbiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA. ³Department of Immunology, Institute of Human Virology, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China. ⁴Viral Pathogenesis Section, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA. ⁵Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri 63110, USA. ⁶Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, Missouri 63110, USA. ⁷The Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, Saint Louis, Missouri 63110, USA. 


00 MONTH 2016 | VOL 000 | NATURE | 1 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 1, panels a-h. a, b, WNV infection of gene-edited cells (sgRNA1 and sgRNA2), with genes grouped as ER translocation, oligosaccharyl transferase and ERAD. c, 293T cells, ZIKV. d, 293T cells, JEV. e, 293T cells, DENV-2. f, 293T cells, YFV. g, 293T cells, WNV, and h, 293T cells, JEV: virus yield against hours after infection for WT, STT3A, SEC61B, SEC63, SPCS1 and SPCS3.] 


Figure 1 | Genes required for flavivirus infection. a, b, Genes were selected for validation based on statistical analysis (Supplementary Tables 1 and 2). Gene-edited 293T (a) and HeLa cells (b) were infected with WNV and analysed 12 h later for E protein. c-f, Effect of gene editing on ZIKV (c), JEV (d), DENV-2 (e), and YFV (f) infection in 293T cells. The results are the average of two or three independent experiments. g, h, 293T cells expressing indicated sgRNAs were infected with WNV (g) or JEV (h) and virus yield was determined. One of two independent experiments performed in triplicate is shown. Statistical significance was determined by ANOVA with a multiple comparisons correction (*P < 0.05, **P < 0.01, ****P < 0.0001; a-f). Error bars indicate s.e.m. ER, endoplasmic reticulum; ERAD, endoplasmic reticulum-associated degradation. 


and refs 14, 15). Gene-edited 293T cells were infected with WNV 
or JEV, and lysates were analysed. Reduced levels of E and prM proteins 
were found in SPCS1−/− clones and SPCS1 or SPCS3 bulk gene-edited 
cells 12 h after infection, and by 24 h higher molecular mass 


[Figure 2, panels a and d. a, WNV Kunjin, DL1 cells. d, WNV Kunjin, AAG2 cells. Rows: control β-gal siRNA, Spase12 (SPCS1), Spase25 (SPCS2), Spase22-23 (SPCS3), CG1518 (STT3A), Sec61β (SEC61B), Sec63 (SEC63), Srp72 (SRP72), tws (PPP2R2D). x axis: fold change in infection.] 


bands reacted with anti-E or anti-prM/E antibodies¹⁶ (Fig. 3m, n and 
Extended Data Figs 3g, 6a, b). We next examined whether SPCS1 is 
required for cleavage of the viral non-structural proteins NS1-NS2A, 
2K-NS4B, or NS2B-NS3. In SPCS1−/− cells, infection with WNV 


[Figure 2, panels b and e. b, DENV-2, DL1 cells; x axis: fold change in infection. e, Inducible RNAi in flies, hs-GAL4>+ versus hs-GAL4>Spase22-23 (SPCS3) IR; x axis: normalized PFU per whole fly, WNV Kunjin.] 


Figure 2 | Endoplasmic reticulum-associated genes are required for flavivirus infection of insect cells. a, b, Drosophila DL1 cells were treated with dsRNA and infected with WNV (Kunjin) (a) or DENV-2 (b) for 30 h. Gene names of human orthologues are given in parentheses. The percentage of infected cells was normalized to the control β-galactosidase dsRNA. The data are expressed as the mean normalized value ± s.d. Statistically significant differences were determined by Student's t-test (**P < 0.01; ***P < 0.001; ****P < 0.0001) and were compared to control dsRNA. The data are pooled from four experiments in duplicate. c, Cell viability. DL1 cells were treated with dsRNA and processed 30 h later. 




[Figure 2, panels c and f. c, Cell viability, DL1 cells; x axis: relative cell viability. f, Infection in flies, wild-type +/+ versus Spase12 (SPCS1) +/−; x axis: PFU per whole fly, WNV Kunjin.] 


d, Aedes aegypti AAG2 cells were treated with dsRNA, infected with WNV (Kunjin) for 30 h, and processed for viral antigens. e, SPCS3-silenced Drosophila (Hs-Gal4>UAS-Spase22-23 IR (inverted repeat)) or sibling controls were infected with WNV (Kunjin) and titres measured 7 days later. The fold-change in titres of pools of ten flies from three experiments is shown (normalized mean ± s.d., *P < 0.05 by Student's t-test). f, Wild-type or Spase12(EY10774)+/− sibling flies were infected with WNV (Kunjin) and titres measured 7 days later. Data from pools of five flies in three independent experiments are shown (*P < 0.05 by Student's t-test). 




[Figure 3, panels a-f. a, Western blot (SPCS1, β-actin) of control and SPCS1 sgRNA clones. b, Replicon studies: log RLU (per µg protein) against hours after transfection for control and SPCS1−/− cells carrying YFV-WT or YFV-GVD replicons. c-f, Viral yield against hours after infection for control cells and SPCS1−/− clones 1 and 2; panel d is labelled DENV-2.] 



Figure 3 | SPCS1 is required for flavivirus protein processing and infection. a, Western blotting of SPCS1−/− 293T cells. b, Cells were transfected with YFV-luciferase replicon RNA (wild-type GDD or loss-of-function GVD). Firefly luciferase activity was measured and normalized to intracellular protein levels. The data reflect the average of two or three independent experiments performed in duplicate. c-h, Cells were infected with WNV (c, h), DENV-2 (d), JEV (e), YFV (f) or ZIKV (g), and viral yield measured. In h, cells were trans-complemented with an SPCS1 or control plasmid. Results are the average of two or three independent experiments performed in triplicate. i-k, Cells were infected with CHIKV (alphavirus), RVFV (bunyavirus), or VSV (rhabdovirus) and viral yield was measured. Results are the average of two or three independent experiments performed in triplicate. l, The polyprotein processing strategy of flaviviruses¹³. Red and blue arrows indicate sites of cleavage by host and viral (NS2B-NS3) proteases, respectively. m-o, Control or SPCS1−/− 293T (m, o) or Huh7.5 (n) cells were infected with WNV (m, o) or JEV (n). Lysates were blotted with (m) anti-WNV E, (n) anti-JEV E, or (o) anti-WNV NS1 monoclonal antibodies. Higher molecular mass bands (Ehi, Emed and NS1hi) that react with anti-flavivirus monoclonal antibodies are indicated. One experiment of three is shown. p, 293T cells were infected with CHIKV or RVFV. Lysates were blotted with anti-CHIKV 


resulted in decreased expression of NS1 and the accumulation of 
higher molecular mass bands (Fig. 3o). We detected lower levels of 
NS4B protein in SPCS1−/− cells; in transfection studies with a tagged 
2K-NS4B plasmid, a higher molecular mass band was observed. For 
NS1-NS2A and NS3, we did not detect aberrant cleavage (Extended 
Data Fig. 6). We also tested the effects on HCV E2 glycoprotein and 
found decreased levels in SPCS1−/− cells (Extended Data Fig. 7). In 
comparison, alphavirus or bunyavirus glycoproteins, which also require 
endoplasmic reticulum processing¹⁷,¹⁸, showed intact expression in 
SPCS1−/− cells (Fig. 3p and Extended Data Fig. 3b, c). 

To isolate the effects of the SPCS complex from infection, we 
transfected a prM-E plasmid, which produces subviral particles (SVPs)¹⁹. 
Immunoblotting of cell lysates for E and prM proteins showed reduced 
levels and higher molecular mass bands in SPCS1- or SPCS3-deficient 
cells, and these changes correlated with a reduction in the number of 
SVPs (Extended Data Fig. 8a-c). We tested whether cleavage of 
flavivirus protein signal sequences depended on SPCS1. We transfected 
WNV structural (capsid (C), prM, M, E) and secreted non-structural 
(NS1) genes with native or MHC class I (Kᵇ) signal sequences into 
SPCS1−/− cells, and evaluated protein expression (Fig. 4). 

Expression of C protein from a C-prM-E plasmid was equivalent 
in control and SPCS1−/− cells, although in the absence of the 
viral protease, C did not migrate at its normal size (Extended Data 
Fig. 8d). However, cleavage of the downstream proteins prM and E was 
reduced in SPCS1−/− cells. When NS2B-NS3 was supplied in trans, 
C was cleaved from prM-E and accumulated at the correct size in 
control and SPCS1−/− cells. Thus, expression or cleavage of C is not 
affected by SPCS1. 

We next evaluated expression of prM and M. When the native prM 
leader sequence was used, expression of prM and its furin-cleavage 
product M was reduced in SPCS1−/− cells (Fig. 4a, groups 1 and 3). 
Substitution of the Kᵇ leader rescued prM and M expression in 


E2 or anti-RVFV Gn monoclonal antibodies. One 
experiment of two is shown. For gel source data, 
see Supplementary Fig. 1. FFU, fluorescence- 
focus forming unit; PFU, plaque-forming unit. 




SPCS1−/− cells only when prM was on a separate plasmid (Fig. 4a, 
group 2) but not as a prM-E plasmid (Fig. 4a, group 4). Thus, specific 
leader sequences determine the dependence of prM and M protein 
expression on SPCS1, and downstream proteins can modulate pro- 
cessing efficiency. 

When E was transfected, its expression was largely independent 
of SPCS1 or the Kᵇ leader sequence (Fig. 4b, groups 1 and 2). When 
E was cloned downstream of prM, accumulation of E was not detected 
in SPCS1−/− cells (Fig. 4b, groups 3 and 4). This finding suggested 
that the native leader sequence of E was not cleaved in SPCS1−/− cells 
when presented as an 'internal' leader sequence or that epistatic effects 
of the upstream prM protein reduced the stability of E protein. To test 
which of these possible explanations was correct, we performed ³⁵S 
pulse-chase studies in prM-E-transfected cells. In control cells, only 
a single E protein band was visible, indicating rapid prM-E cleavage. 
However, prM-E and E bands were both present in SPCS1−/− cells 
(Fig. 4c, top) and remained in an endoplasmic reticulum-resident form 
(Fig. 4c, bottom). A short 3-min ³⁵S pulse showed a delay in the cleavage 
of prM-E in SPCS1−/− cells (Fig. 4d). 

We assessed the expression of NS1, which also requires endoplasmic 
reticulum-dependent signal sequence cleavage. When NS1 was 
transfected into cells, SPCS1 was not required for expression (Fig. 4e, group 1). 
When NS1 was cloned downstream of E (Fig. 4e, groups 2 and 3) 
or prM-E (Fig. 4e, groups 4 and 5), NS1 levels were reduced in 
SPCS1−/− cells. After blotting with an anti-NS1 monoclonal antibody, 
a 90-kDa band was visible in blots from SPCS1−/− cells (Fig. 4e, group 2), 
which probably represented uncleaved E-NS1; this result was 
corroborated by blotting for E protein (Fig. 4b, groups 5 and 6). Thus, 
placement of the NS1 leader sequence into an internal position rendered it 
more dependent on SPCS1 for cleavage. 

Flavivirus SVPs can be produced after transfection of prM and E on 
single or separate plasmids¹⁹⁻²¹. Transfection of prM-E encoding native 




[Figure 4, panels a-h. a, Anti-prM/M blots. b, Anti-E blots. c, d, prM-E transfection, anti-E immunoprecipitation (pulse-chase with Endo H or PNGase F treatment; 3 min ³⁵S pulse). e, Anti-NS1 blots of lysate and supernatant. f, Anti-prM/M and anti-E blots, and levels of E protein (SVPs) in supernatant. g, prM-FLAG transfection, anti-FLAG immunoprecipitation, 3 min ³⁵S pulse. h, Model spanning the ER lumen and cytoplasm; arrow key: SPCS1-dependent signal sequence; signal sequence affected by upstream SPCS1-dependent cleavage; viral NS2B-3 protease-dependent cleavage; unknown signal peptidase pathway.] 


Figure 4 | SPCS1 is required for cleavage of the C-prM leader peptide and internal leader sequences. a, 293T cells were transfected with prM or prM-E plasmids containing native (C-prM (green box) and prM-E (blue box)) or Kᵇ (red box) leader sequences. Blots of lysates were probed with anti-prM/M monoclonal antibodies. One experiment of three is shown. b, 293T cells were transfected with E, prM-E, and E-NS1 plasmids containing native leader sequences (as in a and E-NS1 (orange box)) or a Kᵇ leader. Blots of lysates were probed with anti-E monoclonal antibodies. A higher molecular mass band corresponds to uncleaved E-NS1. One experiment of two is shown. c, d, 293T cells were transfected with a prM-E plasmid. After 24 h, cells were labelled for 40 min (c) or 3 min (d) with ³⁵S cysteine-methionine. Lysates were immunoprecipitated with an anti-E protein monoclonal antibody before SDS-PAGE. c, Top, cysteine-methionine was added for chase times (0-4 h). c (bottom) and d, Immunoprecipitates were untreated or treated with Endo H or PNGase F for 1 h at 37°C. One experiment of two is shown. e, 293T cells were transfected with NS1, E-NS1, or prM-E-NS1 plasmids containing native 


or Kᵇ and native internal signal sequences resulted in loss of 
expression of prM and E or SVPs in SPCS1−/− cells (Fig. 4f, groups 5 and 6). 
When prM and E were co-transfected, the proteins were detected in 
SPCS1−/− cell lysates (Fig. 4f, groups 1 and 2) and supernatant, albeit 
at lower levels. In SPCS1−/− cells, prM negatively affected E but not NS1 
production (Fig. 4f (compare groups 1, 2, and 7) and Extended Data 
Fig. 8e), possibly because of its chaperone-like function for E protein²⁰. 
Defects in co-expression of prM and E in SPCS1−/− cells were corrected 




viral or Kᵇ leaders. Blots of lysates or supernatants were probed with anti-NS1 monoclonal antibodies. One experiment of two is shown. f, 293T cells were transfected with prM + E, prM-E, or E plasmids containing viral or Kᵇ leaders. Left, blots of lysates were probed with anti-prM/M or anti-E monoclonal antibodies. One experiment of two is shown. Right, levels of SVPs in supernatant at 24 h. Data are pooled from independent experiments performed in triplicate (**P < 0.01, ***P < 0.001, ****P < 0.0001; unpaired t-test). g, 293T cells were transfected with prM-Flag. After 24 h, cells were labelled for 3 min with ³⁵S cysteine-methionine. Lysates were immunoprecipitated with anti-Flag protein monoclonal antibodies. h, Model of processing of flavivirus structural and non-structural proteins based on infection and transfection studies and the literature¹⁴,¹⁵. Arrows indicate cleavage sites requiring SPCS1, sites affected by upstream SPCS1-dependent events, sites cleaved by the viral NS2B-NS3 protease, and sites cleaved via an SPCS1-independent pathway. For gel source data, see Supplementary Fig. 1. 


by inserting the Kᵇ leader sequence in front of the prM gene (Fig. 4f, 
groups 3 and 4). A 3-min ³⁵S pulse and immunoprecipitation 
experiment in SPCS1−/− cells showed an uncleaved form of prM (Fig. 4g). 
To assess whether host surface proteins require SPCS1 for signal 
peptide processing, we profiled SPCS1−/− Jurkat T cells. Whereas 
ten antigens showed no difference in surface expression, levels of 
CD49d-CD29, ULBP1, and HLA-E were reduced by two-to-threefold 
(Extended Data Fig. 9a-c). A decrease in surface expression of ULBP1 




has been reported in cells deficient in SPCS1 or SPCS2 expression²², 
although this phenotype was not explored. In an unbiased approach, 
we analysed the secretome of SPCS1−/− 293T cells by mass 
spectrometry. Of the approximately 245 secreted proteins identified, only 35 were 
downregulated in SPCS1−/− cells, and the fold-changes were small 
(Extended Data Fig. 10 and Supplementary Table 4). We validated 3 of 5 
as being reduced in supernatants of SPCS1−/− cells (Supplementary 
Table 5). Despite profound effects on flavivirus protein processing, an 
absence of SPCS1 only modestly affected the expression of host proteins. 

The differential requirement of SPCS1 for viral and host protein 
processing suggests that components of the SPCS complex in mamma- 
lian and probably insect cells facilitate the cleavage of particular signal 
peptides in specific contexts. There may be additional requirements for 
some viruses, as interactions between SPCS1 and the HCV NS2 and E2 
proteins have been reported²³. 

A recent study performed an analogous CRISPR-based screen with 
WNV²⁴. Endoplasmic reticulum-associated genes were identified 
that prevented WNV-induced cell death. We identified three of these 
genes (EMC4, EMC6, and SEL1L), as did an siRNA screen²⁵. Virtually 
all human gene 'hits' identified in our screen had insect orthologues 
required for optimal flavivirus infection. A subset of our genes were also 
identified in RNAi screens in Drosophila cells²⁶,²⁷. The endoplasmic 
reticulum is a focal site in the flavivirus lifecycle because it supports 
translation, polyprotein processing, replication, and virion morphogen- 
esis. The identification of host gene targets that are selectively required 
for flavivirus infection but not cell survival provides intriguing candi- 
dates for pharmacological inhibition. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 11 November 2015; accepted 6 June 2016. 
Published online 17 June 2016. 


1. Diamond, M. S. & Pierson, T. C. Molecular insight into Dengue virus 
pathogenesis and its implications for disease control. Cell 162, 488-492 
(2015). 

2. Weaver, S. C. et al. Zika virus: History, emergence, biology, and prospects for 
control. Antiviral Res. 130, 69-80 (2016). 

3. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. 
Science 339, 819-823 (2013). 

4. Jinek, M. et al. RNA-programmed genome editing in human cells. eLife 2, 
e00471 (2013). 

5. Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human 
cells. Science 343, 84-87 (2014). 

6. Koike-Yusa, H., Li, Y., Tan, E. P., Velasco-Herrera, M. del C. & Yusa, K. 
Genome-wide recessive genetic screening in mammalian cells with a lentiviral 
CRISPR-guide RNA library. Nat. Biotechnol. 32, 267-273 (2014). 

7. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human 
cells using the CRISPR-Cas9 system. Science 343, 80-84 (2014). 

8. Li, W. et al. MAGeCK enables robust identification of essential genes from 
genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 15, 554 (2014). 

9. Evans, E. A., Gilmore, R. & Blobel, G. Purification of microsomal signal 
peptidase as a complex. Proc. Natl Acad. Sci. USA 83, 581-585 (1986). 

10. Meyer, H. A. & Hartmann, E. The yeast SPC22/23 homolog Spc3p is essential 
for signal peptidase activity. J. Biol. Chem. 272, 13159-13164 (1997). 

11. Khromykh, A. A., Kenney, M. T. & Westaway, E. G. trans-Complementation of 
flavivirus RNA polymerase gene NS5 by using Kunjin virus replicon-expressing 
BHK cells. J. Virol. 72, 7270-7279 (1998). 

12. Jones, C. T., Patkar, C. G. & Kuhn, R. J. Construction and applications of yellow 
fever virus replicons. Virology 331, 247-259 (2005). 

13. Lindenbach, B. D., Murray, C. L., Thiel, H. J. & Rice, C. M. in Fields Virology 
Vol. 1 (eds Knipe, D. M. & Howley, P. M.) 712-746 (Lippincott Williams & 
Wilkins, 2013). 




14. Chambers, T. J., Grakoui, A. & Rice, C. M. Processing of the yellow fever virus 
nonstructural polyprotein: a catalytically active NS3 proteinase domain and 
NS2B are required for cleavages at dibasic sites. J. Virol. 65, 6042-6050 
(1991). 

15. Falgout, B., Pethel, M., Zhang, Y. M. & Lai, C. J. Both nonstructural proteins 
NS2B and NS3 are required for the proteolytic processing of dengue virus 
nonstructural proteins. J. Virol. 65, 2467-2475 (1991). 

16. Throsby, M. et al. Isolation and characterization of human monoclonal 
antibodies from individuals infected with West Nile Virus. J. Virol. 80, 
6982-6992 (2006). 

17. Barth, B. U., Wahlberg, J. M. & Garoff, H. The oligomerization reaction of the 
Semliki Forest virus membrane protein subunits. J. Cell Biol. 128, 283-291 
(1995). 

18. Löber, C., Anheier, B., Lindow, S., Klenk, H. D. & Feldmann, H. The Hantaan virus 
glycoprotein precursor is cleaved at the conserved pentapeptide WAASA. 
Virology 289, 224-229 (2001). 

19. Schalich, J. et al. Recombinant subviral particles from tick-borne encephalitis 
virus are fusogenic and provide a model system for studying flavivirus 
envelope glycoprotein functions. J. Virol. 70, 4549-4557 (1996). 

20. Lorenz, I. C., Allison, S. L., Heinz, F. X. & Helenius, A. Folding and dimerization of 
tick-borne encephalitis virus envelope proteins prM and E in the endoplasmic 
reticulum. J. Virol. 76, 5480-5491 (2002). 

21. Hanna, S. L. et al. N-linked glycosylation of west nile virus envelope proteins 
influences particle assembly and infectivity. J. Virol. 79, 13262-13274 
(2005). 

22. Gowen, B. G. et al. A forward genetic screen reveals novel independent 
regulators of ULBP1, an activating ligand for natural killer cells. eLife 4, 
e08474 (2015). 

23. Suzuki, R. et al. Signal peptidase complex subunit 1 participates in the 
assembly of hepatitis C virus through an interaction with E2 and NS2. PLOS 
Pathog. 9, e1003589 (2013). 

24. Ma, H. et al. A CRISPR-based screen identifies genes essential for West-Nile- 
Virus-induced cell death. Cell Rep. 12, 673-683 (2015). 

25. Krishnan, M. N. et al. RNA interference screen for human genes associated 
with West Nile virus infection. Nature 455, 242-245 (2008). 

26. Yasunaga, A. et al. Genome-wide RNAi screen identifies broadly-acting host 
factors that inhibit arbovirus infection. PLOS Pathog. 10, e1003914 
(2014). 

27. Sessions, O. M. et al. Discovery of insect and human dengue virus host factors. 
Nature 458, 1047-1050 (2009). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by NIH grants U19 AI083019 
(M.S.D.), U19 AI106772 (M.S.D.), R01 AI104972 (M.S.D.), and T32 AI007163 
(E.F.) and by the Washington University Institute of Clinical and Translational 
Sciences (UL1 TR000448 from the National Center for Advancing Translational 
Sciences and P41 GM103422-35 from the National Institute of General Medical 
Sciences). T.C.P. and K.A.D. are supported by the intramural program of NIAID. 
We thank R. Kuhn, A. Garcia-Sastre, H. Zhao, D. Fremont, X. Wang, and R. 
Townsend for reagents, experimental advice, and data analysis; P. Erdmann- 
Gilmore, R. Connors, Y. Mi, and H. Lin for expert technical assistance; and 
X. de Lamballerie and the European Virus Archive goes Global (EVAg) for 
consenting to the use of H/PF/2013 ZIKV strain for this study under a material 
transfer agreement with the EVAg partner, Aix-Marseille Université. 


Author Contributions R.Z. performed the primary CRISPR/Cas9 screen and 
viral protein transfection experiments. Validation studies in cells with different 
viruses were performed by R.Z., P.Z., M.J.G., A.Z., E.F., and M.S.D. J.J.M. performed 
the pulse-chase experiments and mass cytometry. J.P.W. performed studies 
with replicons. K.R. and H.R. performed the insect cell and insect experiments. 
Q.Z. performed the secretome analysis and mass spectrometry. R.Z., S.C., T.C.P., 
and M.S.D. designed the experiments. K.A.D. and T.C.P. provided key reagents. 
Q.Z. performed data analysis. S.C. and M.S.D. wrote the initial draft of the 
manuscript, with the other authors contributing to editing into the final form. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
M.S.D. (diamond@borcim.wustl.edu). 


Reviewer Information Nature thanks W. Wei and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 




METHODS 

Cells and viruses. Vero, BHK21, HeLa, U2OS, Huh7.5, and 293T cells were cultured 
at 37°C in Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% 
fetal bovine serum (FBS). C6/36 Aedes albopictus cells were cultured at 28°C in 
L15 supplemented with 10% FBS and 25 mM HEPES pH 7.3. Drosophila DL1 
cells were cultured at 28°C in Schneider’s medium supplemented with 10% FBS 
as described²⁸. Jurkat cells were cultured at 37°C in RPMI 1640 supplemented 
with 10% FBS and 10mM HEPES pH 7.3. All cell lines were originally acquired 
from American Type Culture Collection or colleagues (Huh7.5) and were 
tested and judged free of mycoplasma contamination. The following viruses were 
used in screening and validation studies: WNV (New York 2000), WNV (Kunjin), 
JEV (14-14-2 vaccine and Bennett strains), DENV-2 (16681 and New Guinea C 
strains), ZIKV (H/PF/2013), YFV (17D vaccine), CHIKV (2006 La Reunion OPY1), 
LACV (original strain), VSV (Indiana), HCV (J6/JFH), and SINV (Toto). With the 
exception of HCV (see below), all other viruses were propagated in BHK21, Vero, 
or C6/36 cells and titrated by standard plaque or focus-forming assays (ref. 29). 

Viral growth analysis. 293T or Huh7.5 cells were infected with WNV (multiplicity 
of infection (MOI) 0.01), JEV (14-14-2 strain, MOI 0.05 or 0.5; Bennett strain, MOI 
0.05), DENV-2 (MOI 3), YFV (MOI 1), ZIKV (MOI 0.05), CHIKV (MOI 0.01), 
SINV (MOI 0.01), RVFV (MOI 1), or VSV (MOI 0.01). After 2h of incubation, 
cells were washed three times and samples were titrated on Vero cells. For HCV 
growth analysis, control and SPCS1 gene-edited Huh7.5 cells were inoculated at 
an MOI of 1 with virus derived from a growth-adapted JFH-1 infectious clone (ref. 30). 
Cells were rinsed 6h after infection to remove unbound virus and samples were 
collected every 24h for 7 days. Viral titres in the supernatant were quantified by 
focus-forming assay, as described previously (ref. 31). 
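
The multiplicities of infection quoted above can be turned into an inoculum volume with simple arithmetic. The following is a minimal sketch, assuming that MOI is defined as infectious units added per cell; the function name, the cell number and the stock titre are hypothetical and are not taken from the paper.

```python
def inoculum_volume_ml(moi: float, n_cells: float, titre_per_ml: float) -> float:
    """Volume of virus stock (ml) that delivers `moi` infectious units per cell.

    Assumes MOI = infectious units added / number of cells.
    """
    if titre_per_ml <= 0:
        raise ValueError("titre must be positive")
    return moi * n_cells / titre_per_ml


# Hypothetical example: 5e5 cells, stock at 1e7 infectious units per ml, MOI 0.01.
volume_ml = inoculum_volume_ml(0.01, 5e5, 1e7)
print(f"{volume_ml * 1000:.2f} microlitres of stock")
```

In practice a stock would be diluted first so that a pipettable volume is added; the calculation is the same with the diluted titre.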

Pooled sgRNA screen and data analysis. A pooled library encompassing 122,411 
different sgRNAs against 19,050 human genes was derived by the Zhang laboratory (ref. 32) 
and obtained from a commercial source (Addgene). The library was packaged 
using a lentivirus expression system and 293T cells were transfected using 
FugeneHD (Promega). Forty-eight hours after transfection, supernatants were 
collected, clarified by centrifugation (3,500 rpm × 20 min), filtered, and aliquoted 
for storage at −80°C. 

For the screen, we generated clonal 293T-Cas9 cells by transfecting the 
lentiCas9-Blast plasmid (Addgene 52962) using FugeneHD transfection reagent 
(Promega), blasticidin selection, and limiting dilution. These 293T-Cas9 cells were 
transduced with lentiviruses encoding individual sgRNAs at an MOI of 0.3. After 
selection with puromycin for 7 days, ~2 × 10⁸ cells were infected with WNV (MOI 
of 1) and then incubated for 2–3 weeks. In parallel, untransduced 293T-Cas9 cells 
were infected to ensure virus-induced infection and cell death. The experiments 
were performed in parallel as either duplicate or triplicate technical replicates in 
two independent biological experiments. 
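
The choice of a low transduction MOI and a large cell number can be checked with a short calculation. This is a sketch only, assuming that lentiviral integrations per cell follow a Poisson distribution (an assumption not stated in the text); the MOI of 0.3, the ~2 × 10⁸ cells and the 122,411 sgRNAs are taken from the description above, and the function names are illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def transduction_summary(moi: float, n_cells: float, n_guides: int) -> dict:
    """Summarize expected transduction outcomes under a Poisson assumption."""
    p_none = poisson_pmf(0, moi)
    p_single = poisson_pmf(1, moi)
    transduced = 1.0 - p_none
    return {
        # Fraction of cells that receive at least one sgRNA.
        "fraction_transduced": transduced,
        # Among transduced cells, fraction carrying exactly one sgRNA.
        "fraction_single_among_transduced": p_single / transduced,
        # Average number of selected cells representing each sgRNA.
        "cells_per_guide": n_cells / n_guides,
    }


summary = transduction_summary(moi=0.3, n_cells=2e8, n_guides=122_411)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions most transduced cells carry a single sgRNA, and each sgRNA is represented by more than a thousand cells at the time of WNV infection.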

Genomic DNA was extracted from the uninfected cells (5 x 10”) or the cells 

(3 x 10”) that survived WNV infection, and sgRNA sequences were amplified®, and 
subjected to next generation sequencing using an Illumina HiSeq 2500 platform. 
The sgRNA sequences against specific genes were recovered after removal of the 
tag sequences using the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) 
and cutadapt 1.8.1. 
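
Once tag sequences have been removed, recovering sgRNAs amounts to matching each trimmed read against the library and tallying the matches. The sketch below assumes exact matching against a dictionary of library sequences; the sequences, identifiers and function name are hypothetical, and it stands in for, rather than reproduces, the FASTX-Toolkit and cutadapt steps.

```python
def count_sgrnas(trimmed_reads, library):
    """Count reads that exactly match an sgRNA sequence in the library.

    trimmed_reads: iterable of read strings with tag sequences already removed.
    library: dict mapping sgRNA sequence -> sgRNA identifier.
    Returns (counts per sgRNA identifier, number of unmatched reads).
    """
    counts = {guide_id: 0 for guide_id in library.values()}
    unmatched = 0
    for read in trimmed_reads:
        guide_id = library.get(read.strip().upper())
        if guide_id is None:
            unmatched += 1
        else:
            counts[guide_id] += 1
    return counts, unmatched


# Hypothetical two-guide library and reads, for illustration only.
example_library = {
    "ACGTACGTACGTACGTACGT": "GENE_A_sg1",
    "TTGGCCAATTGGCCAATTGG": "GENE_B_sg1",
}
example_reads = [
    "ACGTACGTACGTACGTACGT",
    "acgtacgtacgtacgtacgt\n",
    "TTGGCCAATTGGCCAATTGG",
    "GGGGGGGGGGGGGGGGGGGG",
]
example_counts, example_unmatched = count_sgrnas(example_reads, example_library)
print(example_counts, example_unmatched)
```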
Gene validation. The cut-off for candidate gene ‘hits’ was made using a pub- 
lished computational tool (MAGeCK)* and reflected sequencing data showing 
multiple different sgRNAs per gene, the number of sequencing reads per gene, 
and the enrichment of a given sgRNA compared to the uninfected cell library 
(Supplementary Tables 1, 2). From this, we identified 12 genes that showed statis- 
tically significant enrichment compared to uninfected cells. These candidate genes 
were tested for validation by using 3-5 independent sgRNAs per gene from the 
parent library and cloning them into the plasmid pSpCas9(BB)-2A-Puro (PX459) 
V2.0 (Addgene plasmid 62988). Control sgRNAs were taken from the parent 
library. Plasmids were transfected into 293T or HeLa cells using FugeneHD 
transfection reagent and puromycin was added one day later. Three days later, 
puromycin was removed, and cells were allowed to recover for three additional 
days before infection with different viruses. 
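
The hit-calling criteria described above (several sgRNAs per gene, read depth, and enrichment relative to the uninfected library) can be illustrated with a simplified calculation. The sketch below is not MAGeCK and does not reproduce its statistics; it only computes a per-sgRNA log2 enrichment of read fractions and a per-gene summary. The pseudocount, the threshold and all names are assumptions made for illustration.

```python
import math
from statistics import median


def guide_log2_enrichment(selected, control, pseudocount=1.0):
    """log2 ratio of each sgRNA's read fraction in selected vs control cells."""
    guides = sorted(set(selected) | set(control))
    sel_total = sum(selected.get(g, 0) + pseudocount for g in guides)
    ctl_total = sum(control.get(g, 0) + pseudocount for g in guides)
    return {
        g: math.log2(
            ((selected.get(g, 0) + pseudocount) / sel_total)
            / ((control.get(g, 0) + pseudocount) / ctl_total)
        )
        for g in guides
    }


def gene_summary(guide_lfc, guide_to_gene, min_lfc=1.0):
    """Per gene: median guide log2 enrichment and number of guides >= min_lfc."""
    by_gene = {}
    for guide, lfc in guide_lfc.items():
        by_gene.setdefault(guide_to_gene[guide], []).append(lfc)
    return {
        gene: (median(values), sum(v >= min_lfc for v in values))
        for gene, values in by_gene.items()
    }
```

A gene supported by several independently enriched sgRNAs is a more credible candidate than one driven by a single sgRNA, which is why the per-gene count of enriched guides is reported alongside the median.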

For flow cytometric analyses, gene-edited 293T cells were infected with WNV 
(MOI 5, 12h), JEV (MOI 50, 22h), ZIKV (MOI 10, 30h), DENV-2 (MOI 3, 32h), 
YFV (MOI 3, 38h), CHIKV (MOI 1, 6h), SINV (MOI 10, 6h), LACV (MOI 5, 
6h), or VSV-GFP (MOI 1, 5.5h). Gene-edited HeLa cells were infected with WNV 
(MOI 3, 24h). At the indicated times, cells were fixed with 1% paraformaldehyde 
(PFA) diluted in PBS for 20 min at room temperature and permeabilized with 
Perm buffer (HBSS (Invitrogen), 10 mM HEPES, 0.1% (w/v) saponin (Sigma), 
and 0.025% NaN3 (Sigma)) for 10 min at room temperature. Cells then were rinsed 
one additional time with Perm buffer, transferred to a U-bottom plate, and incubated 
for 1h at 4°C with 1 µg ml−1 of the following virus-specific antibodies: WNV 
(human E16 (ref. 33)); DENV-2 (mouse E18 (ref. 34)); JEV (mouse E18 (ref. 34)); 
YFV (mouse E60 (ref. 34)); CHIKV (CHK-11 (ref. 35)); SINV (ascites fluid, ATCC 
VR-1248AF); LACV (807-31 and 807-33, gift from A. Pekosz). After washing, cells 
were incubated with an Alexa Fluor 647-conjugated goat anti-mouse or anti-human 
IgG (Invitrogen) for 1h at 4°C. Cells were fixed in 1% PFA in PBS, processed on 
a FACS Array (BD Biosciences), and analysed using FlowJo software (Tree Star). 
Validation also was performed by an infectious virus yield assay. Bulk 
gene-edited 293T cells were infected with WNV (MOI 0.01) or JEV (MOI 0.5). 
Supernatants were collected at specific times after infection and focus-forming 
assays were performed in 96-well plates as described previously (ref. 36). Following 
infection, cell monolayers were overlaid with 100 µl per well of medium (1× DMEM, 
4% FBS) containing 1% carboxymethylcellulose, and incubated for 22h (WNV) 
or 36h (JEV) at 37°C with 5% CO2. Cells were then fixed by adding 100 µl per 
well of 1% PFA directly onto the overlay at room temperature for 40 min. Cells 
were washed twice with PBS, permeabilized (in 1× PBS, 0.1% saponin, and 0.1% 
BSA) for 20 min, and incubated with antibodies specific for WNV (humanized E16 
(ref. 33)) or JEV (mouse E18 (ref. 34)) E glycoprotein for 1h at room temperature. 
After being rinsed twice, cells were incubated with species-specific HRP- 
conjugated secondary antibodies (Sigma). After further washing, foci were 
developed by incubating in 50 µl per well of TrueBlue peroxidase substrate (KPL) 
for 10 min at room temperature, after which time cells were washed twice in water. 
Well images were captured using Immuno Capture software (Cell Technology Ltd), 
and foci counted using BioSpot software (Cell Technology Ltd). 
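
Counted foci are converted to a titre by correcting for the dilution and the volume plated. The following is a minimal sketch of that standard arithmetic; the paper does not state the formula, the dilution or the inoculum volume, so the function name and all example values are hypothetical.

```python
def ffu_per_ml(foci: int, dilution: float, inoculum_ml: float) -> float:
    """Titre in focus-forming units per ml.

    foci: number of foci counted in the well.
    dilution: dilution of the sample plated in that well (e.g. 1e-3).
    inoculum_ml: volume of diluted sample added to the well, in ml.
    """
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return foci / (dilution * inoculum_ml)


# Hypothetical example: 42 foci at a 1:1,000 dilution with 0.1 ml plated.
print(f"{ffu_per_ml(42, 1e-3, 0.1):.3g} FFU per ml")
```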
Insect cell and fly infections. dsRNAs were generated as described (ref. 37). To silence 
genes using RNAi, insect cells were passaged into serum-free medium containing 
dsRNAs targeting the indicated genes. Cells were serum-starved for 1h, after 
which complete medium was added and cells were incubated for 3 days. Cells 
were infected with WNV (Kunjin strain) at an MOI of 4 or DENV-2 (NGC strain) 
at an MOI of 1 for 30h and then processed for microscopy with automated image 
analysis as described (ref. 38). Control (hs>+) or Spcs3-depleted (hs>Spase22-23 IR 
(Bloomington)) 4–7-day-old female flies were heat shocked (37°C) for 1h for 
three consecutive days to deplete the gene of interest and challenged with WNV 
(Kunjin) (5 PFU). At day 7 after infection, pools of 10 flies were crushed and titred 
by plaque assay. Three independent experiments were performed. Heterozygous 
flies (Spase12(EY10774)) were outcrossed to wild-type flies and either wild-type 
or Spase12 heterozygous sibling controls were challenged with Kunjin for 7 days 
and groups of 5 flies were titred. 
siRNA treatments in human cells. Human U2OS cells were transfected with 
siRNAs against control, SPCS1, SPCS2, or SPCS3 genes for three days, infected 
with WNV (Kunjin strain) or DENV (MOI 1) for 18h, and then processed for 
microscopy with automated image analysis as described (ref. 38). 
Replicon transfection and analysis. Two types of replicons were used. 
SP6-generated YFV replicons. The wild-type and NS5 polymerase mutant 
(GDD→GVD) YFV replicons (YF-FFLuc2A, wild-type and GVD) have been 
published previously”? and were a gift from R. Kuhn. Capped replicon RNA was 
generated using SP6 polymerase with an mMESSAGE mMACHINE kit according 
to the manufacturer’s instructions (Thermo Fisher Scientific). RNA was purified 
using an RNeasy kit (Qiagen) and 2 µg was transfected into control or SPCS1−/− 
Huh7.5 cells using Lipofectamine 3000 according to the manufacturer’s instruc- 
tions (Thermo Fisher Scientific). At specified times, cells were collected, lysed, 
and processed for firefly luciferase activity using a commercial kit (Promega). 
Cleared lysates were tested for Fluc activity using the Dual-Luciferase Reporter 
Assay System (Promega) and the protein concentration was quantified using a BCA 
assay kit (ThermoFisher). Fluc activity (relative light units, RLU) was normalized 
by subtracting background luminescence of transfected cells collected at the time 
of transfection, then the adjusted RLU was divided by the total protein content 
(in µg) to yield RLU per µg protein. 
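
The normalization just described is a two-step calculation: subtract the background luminescence measured at the time of transfection, then divide by total protein. A minimal sketch follows; the function name and the example readings are hypothetical.

```python
def normalized_fluc(rlu: float, background_rlu: float, protein_ug: float) -> float:
    """Background-subtracted firefly luciferase signal per µg of protein.

    rlu: raw relative light units of the sample.
    background_rlu: RLU of transfected cells collected at the time of transfection.
    protein_ug: total protein content of the lysate in µg (e.g. from a BCA assay).
    """
    if protein_ug <= 0:
        raise ValueError("protein content must be positive")
    return (rlu - background_rlu) / protein_ug


# Hypothetical readings: 12,500 RLU, 500 RLU background, 20 µg protein.
print(normalized_fluc(12_500.0, 500.0, 20.0))  # RLU per µg protein
```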
cDNA launched WNV replicons. The construction of wild-type and NS5 polymerase 
mutant (GDD→GVD) WNV replicons (lineage I, strain New York 1999) was based 
on a previously described cDNA launched molecular clone system (ref. 39). The backbone 
of this strategy, a plasmid containing a truncated WNV genome under the 
control of a CMV promoter (pWNV-backbone), was designed to be complemented 
via ligation of a structural gene DNA fragment; transfection of pWNV-backbone 
alone does not result in production of a self-replicating RNA molecule. Using overlap 
extension PCR and unique restriction endonuclease sites, pWNV-backbone 
was modified by the introduction of a fragment downstream of the CMV promoter 
encoding [5′ UTR-cyclization sequence of capsid-FMDV 2a protease- 
signal sequence of E-NS1] to complement the NS2→NS5-3′ UTR already present in 
the pWNV-backbone plasmid, generating the replicon plasmid pWNVI-rep. The 
reporter gene GFP was then cloned upstream of the FMDV 2a protease sequence via 
a unique MluI site to generate pWNVI-rep-GFP. The construction and organization 
of this WNV lineage I replicon is analogous to a previously described lineage 
II WNV replicon (pWNVIIrep-GFP). Finally, QuikChange mutagenesis (Agilent 
Technologies) was used to delete the enhancer portion of the CMV immediate early 
enhancer/promoter, generating pWNVI-minCMV-rep-GFP, and to generate the 
GDD→GVD NS5 polymerase variant. Although the CMV enhancer-promoter 
combination commonly found in cloning vectors results in robust and constitutive 
expression, inclusion of only the minimal CMV promoter (no enhancer) results in 
low-level expression (ref. 41). As such, direct transfection of pWNVI-minCMV-rep-GFP 
results in a dim GFP signal, which reflects translation of the RNA generated by 
DNA-dependent RNA transcription. RNA polymerase-dependent replication of the 
wild-type (but not GVD mutant) replicon results in higher production of GFP 
over time. The eGFP is bracketed by the FMDV 2a autocleavage site, and does not 
rely on host or viral proteases for processing. Wild-type and NS5 GVD variants of 
pWNVI-minCMV-rep-GFP (200 ng) were transfected into 10⁴ control or gene- 
edited 293T cells (96-well plates) using Lipofectamine 2000. At various times after 
transfection, cells were collected, cooled to 4°C, stained sequentially with a bioti- 
nylated anti-9NS1 (ref. 42) (or biotin anti-CHIKV negative control monoclonal 
antibodies) and Alexa 647-conjugated streptavidin. In some samples, cells were 
fixed with 4% PFA in PBS (10 min, room temperature) and permeabilized with 
0.1% (w/v) saponin. Cells were processed for two-colour flow cytometry using a 
MACSQuant Analyzer 10 (Miltenyi Biotec). 

Plasmid transfections. 293T gene-edited cells were transfected with the following 
genes that were derived from a WNV infectious cDNA clone (ref. 43) and then cloned 
into a pHLsec backbone (gift from D. Fremont): V5-C-prM-E, prM, prM-Flag 
(3 x Flag), E, prM-E, prM-E-NS1, E-NS1, NS1, NS1-NS2A-Flag (includes 
full-length NS1 and 231 amino acids of NS2A fused to a C-terminal 3 x Flag), 
and 2K-NS4B-haemagglutinin tag (HA). These plasmids were obtained from 
colleagues (e.g., 2K-NS4B-HA (ref. 44), gift from A. Garcia-Sastre) or in some cases 
engineered to contain either native WNV signal sequences (C-prM, 18 amino 
acids beyond the C terminus of C; prM-E, 17 C-terminal amino acids of prM; 
E-NS1, 24 C-terminal amino acids of E) or the signal sequence of mouse Kb class 
I MHC (N-terminal 21 amino acids). Plasmids were transfected into gene-edited 
293T cells using FugeneHD reagent (Promega) according to the manufacturer's 
instructions. Supernatants containing prM-E subviral particles (SVPs) were collected 
24h after transfection, filtered through a 0.2-µm filter, and stored in aliquots 
at −80°C. For the capture ELISA, Nunc MaxiSorp polystyrene 96-well plates were 
coated overnight at 4°C with mouse E60 monoclonal antibodies (ref. 34) (5 µg ml−1) in a 
pH 9.3 carbonate buffer. Plates were washed three times in enzyme-linked immunosorbent 
assay (ELISA) wash buffer (PBS with 0.02% Tween 20) and blocked for 
1h at 37°C with ELISA block buffer (PBS, 2% bovine serum albumin, and 0.02% 
Tween 20). Supernatants from prM-E plasmid transfected cells were captured on 
plates coated with E60 for 90 min at room temperature. Subsequently, plates were 
rinsed five times in wash buffer and then incubated with humanized anti-WNV 
E16 (1 µg ml−1 in block buffer) for 1h at room temperature. Plates were washed 
five times and then incubated with pre-absorbed biotinylated goat anti-human IgG 
antibody (1 µg ml−1; Jackson Laboratories) for 1h at room temperature in blocking 
buffer. Plates were washed again five times and then sequentially incubated with 
2 µg ml−1 of horseradish peroxidase-conjugated streptavidin (Vector Laboratories) 
and tetramethylbenzidine substrate (Dako). The reaction was stopped with the 
addition of 2 N H2SO4 to the medium, and emission (450 nm) was read using an 
iMark microplate reader (Bio-Rad). 

Western blotting. For virus infected samples, cells were infected with WNV (MOI 
200-1,000, 24h), JEV (MOI 150, 45h), CHIKV (MOI 5, 12h), SINV (MOI 5, 16h), 
RVFV (MOI 2.5, 16h), or HCV (MOI 5, 48 or 72h). Cells (10°) were lysed directly 
in 30 µl RIPA buffer (Cell Signaling) with 0.1% SDS and a cocktail of protease 
inhibitors (Sigma-Aldrich). Samples were prepared in LDS buffer (Life Technologies) 
under non-reducing or reducing (dithiothreitol) conditions. After heating (70°C, 
10 min), samples were electrophoresed using 7% Tris-Acetate or 4-12%, 10% or 
12% Bis-Tris gels (Life Technologies) and proteins were transferred to PVDF mem- 
branes using an iBlot2 Dry Blotting System (Life Technologies). Membranes were 
blocked with 5% non-fat dry powdered milk and probed with antibodies against 
SPCS1 (11847-1-AP, Proteintech), SPCS2 (14872-1-AP, Proteintech), SPCS3 
(ab91222; Abcam), SEC11A (14753-1-AP, Proteintech), SEC11C (HPA026816, 
Sigma) and SEC61B (ab15576, Abcam). For studies with prM-E, prM, E, NS1, 
NS1-2A-Flag, or 2K-NS4B-Flag-transfected or virus-infected cells, membranes 
were probed with anti-E (human E16; mouse CHK-48*°; mouse anti-JEV, oligo- 
clonal pool), anti-NS1 (mouse 8-NS1), anti-NS3 (W1018-54, USBio), anti-NS4B 
(rabbit polyclonal antibody”, gift from W.I. Lipkin) anti-prM (human CR4293"° or 
rabbit WNV-M (IMG-5099A, IMGENEX)), anti-Flag (F1804, Sigma), and the rel- 
evant secondary antibodies. For validation of the secretome experiments, superna- 
tants were electrophoresed and PVDF membranes were probed with anti-CXCL16 
(ab101404, Abcam), anti-SFRP1 (ab126613, Abcam), anti-RNASET2 (ab169655, 
Abcam), anti-LGALS3BP(ab81489, Abcam), anti-SLITL2 (ab173758, Abcam), anti- 
PEDF (ab157207, Abcam), anti- NPC2 (19888-1-AP, Proteintech), anti-CREG1 
(12220-1-AP, Proteintech), and the relevant secondary antibodies. Western blots 
were developed using SuperSignal West Pico Chemiluminescent Substrate or 
SuperSignal West Femto Maximum Sensitivity Substrate (Life Technologies). 




Metabolic labelling, pulse-chase, and immunoprecipitation experiments. Pulse 
and pulse-chase experiments were performed as described previously (ref. 46). After 
starvation in methionine/cysteine-free DMEM for 30 min, 293T cells were labelled 
metabolically with 300 or 500 µCi ml−1 [35S]-methionine/cysteine (PerkinElmer 
Life Sciences) at 37°C for 3 or 40 min. Cells then were washed three times in 
PBS and immediately lysed or incubated in DMEM supplemented with non- 
radiolabelled cysteine (500 µg ml−1) and methionine (100 µg ml−1). Cell lysis was 
performed in 400 µl of 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM PMSF, 1 mM 
EDTA, 5 µg ml−1 aprotinin, 5 µg ml−1 leupeptin, 1% Triton X-100, 1% sodium 
deoxycholate, 0.1% SDS. After preclearing with an irrelevant human monoclonal 
antibody protein A-agarose (Thermo Fisher Scientific) complex, lysates were 
incubated for 1h at 4°C with humanized E16 and E60 monoclonal 
antibodies or anti-Flag and then with protein A-agarose for 2h. The immunoprecipitates 
were washed seven times in 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM 
PMSF, 1 mM EDTA, 5 µg ml−1 aprotinin, 5 µg ml−1 leupeptin, 1% Triton X-100, 
1% sodium deoxycholate, and 0.1% SDS, and then analysed by SDS-PAGE under 
reducing conditions, followed by fluorography. Some immunoprecipitates were 
incubated with 20 mU endoglycosidase H or PNGase F (New England BioLabs) 
for 1h at 37°C before SDS-PAGE and fluorography. 
293T cell viability assay. A Vybrant MTT cell viability assay (Life Technologies) 
was used according to the manufacturer's instructions. Briefly, 10 µl of 12 mM 
MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) was added 
to 10° 293T cells (different gene-edited lines; with or without WNV infection) in 
100 µl phenol-red free medium. Cells were incubated for 4h at 37°C, at which time 
medium was removed and formazan crystals were solubilized in 100 µl of DMSO 
for 10 min at 37°C. Liquid was analysed for absorbance at 540 nm using a 
Synergy H1 Hybrid Plate Reader (Biotek). 
Flow and mass cytometry analysis of Jurkat T cells. The antibodies and 
conjugates used are listed in Supplementary Table 6. For flow cytometry studies, 
wild-type and SPCS1 gene-edited Jurkat T cells were incubated with fluoro- 
phore-conjugated monoclonal antibodies for 30 min at 4°C and then washed three 
times in PBS containing 5% FBS. Cells were immediately processed on an LSRII 
flow cytometer and data were analysed using FlowJo 10.0.7. For mass cytometry 
studies, wild-type and SPCS1 gene-edited Jurkat T cells were labelled with mon- 
oclonal antibodies conjugated with transition element isotopes and analysed on 
a CyTOF 2 mass cytometer (Fluidigm DVS Sciences). Data were analysed using 
Cytobank (http://wustl.cytobank.org) and FlowJo 10.0.7. 
Secretome analysis of SPCS1−/− 293T cells. Wild-type and SPCS1−/− 293T cells 
were cultured in poly-D-lysine treated flasks in FreeStyle 293 Expression Medium 
(ThermoFisher) supplemented with 10% FBS. At 90% confluence, cells were 
washed four times with pre-warmed PBS, then twice with pre-warmed FreeStyle 
293 Expression Medium, and maintained in FreeStyle 293 Expression Medium 
without FBS for 48h. Supernatants were collected and clarified by centrifugation 
at 1,000g for 5 min, and then 10,000g for 20 min at 4°C. Samples were concen- 
trated with Amicon Ultra-15 Centrifugal Filter Units (Millipore) at 5,000g for 1h 
in the presence of 1× protease inhibitors (S8830, Sigma). The concentrates were 
collected and stored at −80°C. After thawing on ice, the samples were exchanged 
twice in digestion buffer (Tris, 0.1 M, pH 8.5 containing 8 M urea) by centrifugation 
(~4,000g, 2h) in Amicon Ultracel 3K units to a volume of ~100 µl. The solubilized 
samples were reduced with 2mM DTT (ThermoScientific) for 30 min at 37°C 
followed by alkylation at room temperature for 30 min with 7 mM iodoacetamide 
(Sigma) in the dark. The alkylated samples were treated with 7mM DTT for 15 min 
at room temperature. After dilution, the samples were digested with LysC (1 µg) 
(Sigma) overnight at 37°C with agitation (ThermoMixer). After dilution of the 
samples to 1.5 M urea with Tris buffer, trypsin (5 µg) (Sigma) was added 
and the incubation was continued overnight at 37°C with mixing. The digested 
samples were acidified to a concentration of 1% trifluoroacetic acid (TFA). 
The peptides were desalted with a SepPak (50 mg) with 0.1% TFA/70% acetonitrile 
in an elution volume of 2 ml. The lyophilized peptides were quantified with 
a fluorescent assay (Thermo Fisher) and 2 µg was labelled with TMT-6 reagents 
according to the vendor. The labelled peptides were desalted and the samples were 
transferred to PCR tubes (0.5 ml) and positioned in 96-well holders for robotic 
solid phase extraction (SPE). Each digest was extracted sequentially with one C4 
tip (Glygen BIOMEK NT3C04) and one porous graphite carbon micro-tip (Glygen 
BIOMEK NT3CAR) with the following auto-pipetting steps: (i) wet tips with AcN/ 
FA (60%/1%) (10 × 25 µl); (ii) equilibrate tips with AcN/FA (1%/1%) (10 × 25 µl); 
(iii) extract peptides with repetitive aspirations of the digest (50 × 25 µl); (iv) wash 
loaded tips with AcN/FA (1%/1%) (10 × 25 µl); and (v) elute peptides with AcN/ 
FA (60%/1%) (5 × 65 µl). The SPE eluents were pooled and dried in a SpeedVac 
centrifuge and transferred to an autosampler vial for LC-MS analysis. 

The remainder of the peptides were dissolved in the binding buffer (100 mM 
Tris, pH 7.8 containing NaCl (0.5 M), MnCl2 (1 mM) and CaCl2 (1 mM)). The dried 
lectins (Con-A and WGA; Sigma) were dissolved in binding buffer (4 mg ml−1). 
The rCA120 (10 mg ml−1), Con-A and WGA were added to the peptide solution 
(36 µl and 10 µl, respectively). After incubation at room temperature, the mixture 
was transferred to a YM-10 Microcon filter unit. After centrifugation (14,000g) for 
10 min and washing with binding buffer (100 µl), the filter unit was transferred to 
another tube. The peptides were released with the addition of PNGase (10 units) in 
100 µl of ammonium bicarbonate buffer (50 mM) after incubation at 37°C for 1.5h. 
The enzyme addition and incubation was repeated and the peptides recovered 
with one wash of PNGase buffer. The peptides were acidified to 5% formic acid 
and desalted, labelled with TMT-6, and prepared for LC-MS as described above 
for the total pool of peptides. 

LC-MS analysis. LC-ESI/MS/MS analysis was conducted with a Q-Exactive Plus 
mass spectrometer coupled to an EASY-nanoLC 1000 system (Thermo-Fisher). 
For each Hp-RP fraction, 2 µl of sample was loaded onto a 75 µm i.d. × 25 cm 
Acclaim PepMap 100 RP column (Thermo-Fisher Scientific). Peptide separations 
were started with 95% mobile phase A (0.1% FA) for 5 min and increased to 30% 
B (100% ACN, 0.1% FA) over 180 min, followed by a 25-min gradient to 45% 
B, a 5-min gradient to 95% B and wash at 90% B for 7 min, with a flow rate of 
300 nl min−1. Full-scan mass spectra were acquired by the Orbitrap mass analyser 
in the mass-to-charge ratio (m/z) of 375-1,400 and with a mass resolving power 
set to 70,000. Fifteen data-dependent high-energy collisional dissociations were 
performed with a mass resolving power set to 35,000, a fixed first m/z of 100, an 
isolation width of 0.7 m/z, and the normalized collision energy (NCE) setting of 
32. The maximum injection time was 50 ms for parent-ion analysis and 105 ms 
for product-ion analysis. Target ions already selected for MS/MS were excluded 
dynamically for 30 s. An automatic gain control target value of 3 x 10° ions was 
used for full MS scans and 10° ions for MS/MS scans. Peptide ions with charge 
states of one or greater than six were excluded from MS/MS interrogation. 
Protein identification and quantification with TMT. All raw data were processed 
using Proteome Discoverer (version 2.1.0.81, Thermo-Fisher Scientific). MS/MS 
spectra were searched with SequestHT engine against the human UniRef database 
(69,021 entries; version 2014_05), assuming the digestion enzyme was trypsin 
with a maximum of two missed cleavages allowed. The searches were performed with 
a fragment ion mass tolerance of 0.02 Da and a parent ion tolerance of 20 ppm. 
Deamidation of asparagine and glutamine, acetylation and TMT 6-plex derivatization 
of N termini and oxidation of methionine were specified in Proteome 
Discoverer as variable modifications. Iodoacetamide derivatization of cysteine 
and TMT 6-plex derivatization of lysine were specified as fixed modifications. 
Peptide spectral matches (PSM) were validated using Percolator based on q-values 
at a 1% FDR (ref. 47). Peptides were filtered to 1% FDR and grouped into proteins at 1% 
FDR as specified in Proteome Discoverer. The intensities of TMT reporter ions 
were determined with Proteome Discoverer at a mass tolerance of 0.01 Da and 
used for peptide quantification. The median intensity of the peptides assigned 
to the same protein was used to represent the protein intensity. Peptide 
identifications that could be assigned to more than one protein were removed from 
protein quantification. 
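
The protein-level roll-up described above (median of peptide intensities per protein, with shared peptides excluded) can be sketched in a few lines. This is an illustration of the stated rule only, not the Proteome Discoverer implementation; the input layout, the function name and the example values are assumptions.

```python
from collections import defaultdict
from statistics import median


def protein_intensities(peptides):
    """Median peptide intensity per protein, ignoring shared peptides.

    peptides: iterable of (intensity, list_of_protein_ids) pairs.
    Peptides assigned to more than one protein are not used for quantification.
    """
    by_protein = defaultdict(list)
    for intensity, proteins in peptides:
        unique_ids = set(proteins)
        if len(unique_ids) != 1:
            continue  # shared (or unassigned) peptide: excluded
        by_protein[next(iter(unique_ids))].append(intensity)
    return {protein: median(values) for protein, values in by_protein.items()}


# Hypothetical reporter-ion intensities for illustration.
example_peptides = [
    (100.0, ["P1"]),
    (300.0, ["P1"]),
    (200.0, ["P1"]),
    (50.0, ["P1", "P2"]),  # shared peptide, dropped
    (80.0, ["P2"]),
]
print(protein_intensities(example_peptides))
```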

Proteomic Data Analysis. Protein ratios were normalized so that the median 
log2 ratio was 0. Data analysis was performed with the free software 
environment for statistical computing and graphics, R (http://www.R-project.org). 
Gene ontology analysis was carried out using the Database for Annotation, 
Visualization and Integrated Discovery (DAVID) (refs 48, 49). Data from duplicated LC/ 
MS/MS analysis were first averaged and protein abundance ratios were log2- 
transformed before statistical analysis. A one-way ANOVA with Benjamini- 
Hochberg correction was performed to assess the statistical significance in protein 
abundance changes between wild type and SPCS1−/− cells. 
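
Two of the steps named above lend themselves to a short illustration: centring log2 protein ratios on a median of 0, and adjusting a set of P values by the Benjamini-Hochberg procedure. The authors used R; the Python sketch below is an independent illustration under that description, the ANOVA itself is not implemented, and the function names and example values are hypothetical.

```python
import math
from statistics import median


def median_centre_log2(ratios):
    """log2-transform ratios and shift them so that their median is 0."""
    logs = [math.log2(r) for r in ratios]
    centre = median(logs)
    return [value - centre for value in logs]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical values for illustration.
print(median_centre_log2([1.0, 2.0, 4.0]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```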

Statistical analysis. No statistical methods were used to predetermine sample size. 
The experiments were not randomized and the investigators were not blinded to 
allocation during experiments and outcome assessment. Statistical significance 
was assigned when P values were <0.05 using GraphPad Prism Version 5.04. Viral 
antigen staining after expression of sgRNA was analysed using a one-way ANOVA 
adjusting for repeated measures with a Dunnett's multiple comparison test or with 
a Mann-Whitney test depending on the number of comparison groups. Levels 
of E protein in the supernatant from CRISPR-Cas9 gene-edited cells were 
analysed by a one-way ANOVA. Analysis of siRNA in insect and human cells was 
performed using a Student's t-test or ANOVA. 


28. Rose, P. P. et al. Natural resistance-associated macrophage protein is a cellular 
receptor for sindbis virus in both insect and mammalian hosts. Cell Host 
Microbe 10, 97-104 (2011). 

29. Brien, J. D., Lazear, H. M. & Diamond, M. S. Propagation, quantification, 
detection, and storage of West Nile virus. Curr. Protoc. Microbiol. 31, 15D.3.1- 
15D.3.18 (2013). 

30. Jiang, J. & Luo, G. Cell culture-adaptive mutations promote viral protein-protein 
interactions and morphogenesis of infectious hepatitis C virus. J. Virol. 86, 
8987-8997 (2012). 

31. Sabo, M. C. et al. Neutralizing monoclonal antibodies against Hepatitis C Virus 
E2 protein bind discontinuous epitopes and inhibit infection at a 
postattachment step. J. Virol. 85, 7005-7019 (2011). 

32. Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide 
libraries for CRISPR screening. Nat. Methods 11, 783-784 (2014). 

33. Oliphant, T. et al. Development of a humanized monoclonal antibody with 
therapeutic potential against West Nile virus. Nat. Med. 11, 522-530 (2005). 

34. Oliphant, T. et al. Antibody recognition and neutralization determinants on 
domains I and II of West Nile Virus envelope protein. J. Virol. 80, 12149-12159 
(2006). 

35. Pal, P. et al. Development of a highly protective combination monoclonal 
antibody therapy against Chikungunya virus. PLoS Pathog. 9, e1003312 
(2013). 

36. Fuchs, A., Pinto, A. K., Schwaeble, W. J. & Diamond, M. S. The lectin pathway of 
complement activation contributes to protection from West Nile virus infection. 
Virology 412, 101-109 (2011). 

37. Boutros, M. et al. Genome-wide RNAi analysis of growth and viability in 
Drosophila cells. Science 303, 832-835 (2004). 

38. Hackett, B. A. et al. RNASEK is required for internalization of diverse 
acid-dependent viruses. Proc. Natl Acad. Sci. USA 112, 7797-7802 (2015). 

39. Lin, T. Y. et al. A novel approach for the rapid mutagenesis and directed 
evolution of the structural genes of west nile virus. J. Virol. 86, 3501-3512 
(2012). 

40. Pierson, T. C. et al. A rapid and quantitative assay for measuring antibody- 
mediated neutralization of West Nile virus infection. Virology 346, 53-65 
(2006). 

41. Mishin, V. P., Cominelli, F. & Yamshchikov, V. F. A ‘minimal’ approach in design 
of flavivirus infectious DNA. Virus Res. 81, 113-123 (2001). 

42. Chung, K. M. et al. Antibodies against West Nile Virus nonstructural protein 
NS1 prevent lethal infection through Fc gamma receptor-dependent and 
-independent mechanisms. J. Virol. 80, 1340-1351 (2006). 

43. Beasley, D. W. et al. Envelope protein glycosylation status influences mouse 
neuroinvasion phenotype of genetic lineage 1 West Nile virus strains. J. Virol. 
79, 8339-8347 (2005). 

44. Muñoz-Jordan, J. L. et al. Inhibition of alpha/beta interferon signaling by the 
NS4B protein of flaviviruses. J. Virol. 79, 8004-8013 (2005). 

45. Medigeshi, G. R. et al. West Nile virus infection activates the unfolded protein 
response, leading to CHOP induction and apoptosis. J. Virol. 81, 10849-10860 
(2007). 

46. Miner, J. J. et al. Cytoplasmic domain of P-selectin glycoprotein ligand-1 
facilitates dimerization and export from the endoplasmic reticulum. J. Biol. 
Chem. 286, 9577-9586 (2011). 

47. Käll, L., Canterbury, J. D., Weston, J., Noble, W. S. & MacCoss, M. J. Semi- 
supervised learning for peptide identification from shotgun proteomics 
datasets. Nat. Methods 4, 923-925 (2007). 

48. Huang, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis 
of large gene lists using DAVID bioinformatics resources. Nat. Protocols 4, 
44-57 (2009). 

49. Huang, W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: 
paths toward the comprehensive functional analysis of large gene lists. Nucleic 
Acids Res. 37, 1-13 (2009). 


[Extended Data Figure 1 (image not reproduced; recoverable panel text). Screen 
scheme: pooled lentivirus sgRNA library, 122,411 sgRNAs targeting 19,050 human 
genes (on average 6 sgRNAs per gene); transduction into 293T-Cas9 cells at an MOI 
of 0.3 (duplicate replicates, two biological experiments); WNV-infected or 
uninfected cells; 2 weeks; comparison of sgRNA abundance in surviving cells using 
deep sequencing. MTT assay: cell viability of 293T CRISPR-Cas9 cells 24 hours after 
WNV infection (Control, STT3A, SEC61B, SEC63, SPCS1, SPCS3). Western blots: 
SEC61B, SPCS1 and SPCS3, each with β-actin. Relative infection with SINV, CHIKV, 
LACV and VSV. Flow cytometry of Flag expression in GFP+ cells (293T cells, WNV): 
sgSTT3A, sgSEC61B, sgSPCS1 and sgSPCS3 cells with empty vector or STT3A-FLAG, 
SEC61B-FLAG, SPCS1-FLAG or SPCS3-FLAG.] 


Extended Data Figure 1 | Results of CRISPR-Cas9 screen. a, Scheme
of gene-editing screen. b, Analysis of cell viability of gene-edited cells.
WNV-infected CRISPR-Cas9 edited bulk cells were evaluated for cell
viability using a metabolic MTT assay 24h after infection. The results
were pooled from several independent experiments performed in
duplicate and data were compared to cells edited with a control sgRNA.
None of the differences were statistically significant compared to the
control. c, Western blotting confirms the efficiency of gene editing of
SEC61B, SPCS1, and SPCS3. β-actin is included as a loading control.
d, Effect of gene editing on infection by other RNA viruses. sgRNA-edited
bulk selected cell populations were infected with alphaviruses
(SINV or CHIKV), a bunyavirus (LACV) or a rhabdovirus (VSV). Cells
were analysed for intracellular viral antigen staining by flow cytometry
using virus-specific monoclonal antibodies. The data are representative
of two independent experiments and are expressed as relative infection
(viral antigen expression) compared to the sgRNA control.
d-f, Trans-complementation of sgRNA gene-edited cells with Flag-tagged genes.
d, Individual sgRNA bulk gene-edited cell lines were trans-complemented
with cDNA expressing C-terminal Flag-tagged versions of their respective
genes and GFP or an empty vector control and GFP. Transfected cells
were analysed by flow cytometry for expression of the Flag-tag in the
GFP+ cells. The data are representative of two independent experiments.
e, Western blotting of SPCS1 and SPCS3 trans-complemented genes after
incubation with an anti-Flag antibody. f, Individual sgRNA cell lines
were trans-complemented with cDNA expressing C-terminal Flag-tagged
versions of their respective genes and GFP or an empty vector control
and GFP. Transfected cells were infected with WNV (MOI 5) and 12h
later cells were stained for intracellular E antigen and processed by flow
cytometry. The data are representative of three independent experiments
performed in triplicate and reflect the percentage of WNV-infected cells
in the fraction that expressed GFP. The indicated comparisons were
statistically significant (****P < 0.0001), as determined by the
Mann-Whitney test. For gel source data, see Supplementary Fig. 1.


[Extended Data Figure 2, image not reproduced. Left, percentage of cells
infected with WNV-Kunjin or DENV-2 after siRNA of SPCS genes in U2OS cells
(control 1, control 2 and SPCS siRNAs). Right, western blots of SPCS1,
SPCS2 and SPCS3, each with tubulin.]


Extended Data Figure 2 | Gene silencing of SPCS genes in human
U2OS cells. Human U2OS cells were transfected with either control or
SPCS1, SPCS2, or SPCS3 siRNAs and infected with WNV (Kunjin strain)
or DENV (MOI 1) for 18h. Left, the percentage of infected cells was
determined by automated fluorescence microscopy. The data are expressed
as the mean normalized value ± s.d. **P < 0.01; ***P < 0.0001 compared
to control siRNA by ANOVA with a multiple comparison correction.
The data are pooled from three independent experiments assayed in
quadruplicate. No reduction in infection of CHIKV or SINV was observed
after SPCS gene silencing (data not shown). Right, western blotting of
SPCS proteins in gene-silenced U2OS cells. Representative results are
shown and tubulin is included as a loading control. For gel source data,
see Supplementary Fig. 1.


[Extended Data Figure 3, image not reproduced. Panels show growth curves
(log10 FFU/ml against hours or days after infection) for SINV, WNV, HCV,
ZIKV and JEV in control and SPCS1-/- (or sgControl and sgSPCS1) cells,
western blots and a pulse-chase gel with molecular mass markers, and
alignments of the wild-type SPCS1 sgRNA target and PAM with the edited
alleles of 293T SPCS1-/- clone 1 and Huh7.5 SPCS1-/- clone 7.]

Extended Data Figure 3 | Viral infection in SPCS1-/- cells.
a-c, Alphaviruses replicate and are processed efficiently in 293T cells
in the absence of expression of SPCS1. a, SINV infection in control and
SPCS1-/- clonal cells. Cells were infected (MOI 0.01) and supernatants
were collected and analysed by FFA. The results are the average of
two independent experiments performed in triplicate. b, Control and
SPCS1-/- gene-edited 293T cells were infected with SINV. At the indicated
time, lysates were prepared, electrophoresed and western blotted with
anti-SINV E2 ascites fluid (ATCC VR-1248AB). c, Control or SPCS1-/-
293T cells were infected with CHIKV (MOI 5). After 8h, cells were
labelled for 30 min with [35S]cysteine/methionine. Excess cold
cysteine/methionine was added for indicated chase times (0, 1 or 4h). An
uninfected control established the specificity of the immunoprecipitation.
After 35S labelling, lysates were prepared and immunoprecipitated with
anti-E2 monoclonal antibodies (CHK-48). Immunoprecipitates were
left untreated (blank) or treated with Endo H (E) or PNGase F (P)
for 1h at 37°C before SDS-PAGE and fluorography. d, Sequencing of
SPCS1 alleles in gene-edited 293T and Huh7 cell clones after puromycin
selection and limiting dilution cloning. The sgRNA targeting site and
the 'PAM' sequences are highlighted at the top of the wild-type gene,
and the sequences of edited alleles are indicated. e, Western blotting of
bulk-selected or clonal (clone 7) Huh7.5 cells (control and SPCS1 sgRNA
selected) for expression of SPCS1 (~12 kDa). f, WNV, HCV, ZIKV,
and JEV (Bennett strain) infection in control and SPCS1-deficient Huh7.5
cells. Cells were infected at an MOI of 0.01 (WNV, ZIKV, JEV) or 1 (HCV)
and supernatants were collected and analysed by FFA. The results are the
average of two independent experiments performed in triplicate.
g, Control or SPCS1-/- Huh7.5 cells were infected at an MOI of 150 for
45h with a pathogenic JEV isolate (Bennett strain). Lysates were blotted
with an anti-JEV E monoclonal antibody. Higher molecular mass forms of E
that reacted specifically with the anti-E monoclonal antibody are
indicated. One representative experiment of two is shown
and loading controls (β-actin) are included. For gel source data,
see Supplementary Fig. 1.


[Extended Data Figure 4, image not reproduced. Top, western blots of 293T
cells expressing sgControl or the indicated sgSEC11A and sgSEC11C guides,
probed for SEC11A, SEC11C and β-actin. Bottom, titres (log10 FFU/ml) of
flaviviruses (WNV, YFV) and other viruses (SINV, CHIKV, VSV, RVFV) in 293T
cells expressing sgControl, sgSEC11A or sgSEC11C.]

Extended Data Figure 4 | Gene editing of SEC11A and SEC11C does not
substantively affect infection of several enveloped viruses. Top, 293T
cells were treated with the indicated sgRNA and isolated in bulk after
puromycin drug selection. Western blotting confirmed gene editing of
SEC11A (left, 20 kDa) or SEC11C (middle, 22 kDa). No difference in levels
or migration pattern of WNV E was observed in SEC11A or SEC11C
gene-edited cells (right) after WNV infection at an MOI of 200 for 24h.
Spaces between the western blots indicate cropping to remove lanes that
were not relevant to this experiment. Bottom, control or gene-edited
293T cells were infected with viruses and supernatants were harvested
after infection for titration. Left, WNV (MOI 0.01, 72h) or YFV (MOI
1, 72h); right, SINV (MOI 0.01, 72h), CHIKV (MOI 0.01, 36h), VSV
(MOI 0.01, 36h), or RVFV (MOI 1, 72h). Results are representative of two
independent experiments. For gel source data, see Supplementary Fig. 1.


[Extended Data Figure 5, image not reproduced. a, WNV, WT replicon;
b, WNV, GVD replicon. Both panels plot signal against hours after
transfection (0 to 72) for sgControl, sgSTT3A, sgSPCS1 and sgSPCS3 cells;
panel a also includes sgControl with the GVD replicon.]

Extended Data Figure 5 | Effect of sgRNA on translation and replication
of wild-type and NS5 GVD polymerase mutant WNV replicons.
A cDNA launched WNV replicon (a, wild-type; b, GVD polymerase 'dead'
mutant) with a minimal CMV promoter (GFP-NS1-NS2A-NS2B-NS3-
NS4A-NS4B-NS5) was transfected into the indicated gene-edited 293T
cells. At 48 and 72h after transfection, cells were collected and analysed
for GFP expression by flow cytometry. After transfection with the
wild-type replicon, WNV replication was lower in STT3A gene-edited cells, as
determined by ANOVA with a multiple comparisons correction (P < 0.05
at 48 and 72h). Data are the average of three independent experiments.
Note, the GVD replicon data with sgRNA control (translation only) are
provided for comparison in a.


[Extended Data Figure 6, image not reproduced. Panels a-f are western
blots: a, normal and over-exposed anti-E blots of WNV-infected and
uninfected cells at 6, 12 and 24h; b, anti-prM blot over the same time
course; c, anti-NS4B after WNV infection; d, anti-HA after 2K-NS4B
transfection; e, anti-NS1 and anti-FLAG after plasmid transfection;
f, anti-WNV NS3. β-actin is shown as the loading control.]



Extended Data Figure 6 | Processing of WNV proteins in SPCS1 and
SPCS3 gene-edited 293T cells. a, Normal (left) and over-exposed (right)
western blot in SPCS1 and SPCS3 bulk gene-edited 293T cells. The
over-exposure is shown to highlight the accumulation of high molecular mass
bands that react with anti-E protein antibody. Control, SPCS1 and SPCS3
gene-edited 293T cells were infected with WNV or mock-infected for the
indicated times. Lysates were western blotted with an anti-E (human E16)
monoclonal antibody. Under these electrophoresis conditions, natively
processed E protein migrates at ~50 to 55 kDa. Higher molecular mass
bands (E^mod (probably prM-E) and E^hi (probably prM-E-NS1)) that react
specifically with the E monoclonal antibody are present only in SPCS1
and SPCS3 gene-edited 293T cells. The data are representative of two
independent experiments and a loading control (β-actin) is shown.
b, Western blot of SPCS1 and SPCS3 bulk gene-edited 293T cells. Control,
SPCS1, and SPCS3 gene-edited 293T cells were infected with WNV or
mock-infected for the indicated times. Lysates were western blotted with
an anti-prM human monoclonal antibody (CR4293) that recognizes a
shared epitope on prM and E. Higher molecular mass bands (prM-E^hi)
probably represent uncleaved polyprotein forms and are present only in
SPCS1 and SPCS3 gene-edited 293T cells. The data are representative
of two independent experiments and a loading control (β-actin) is
shown. c, Control or SPCS1-/- cells were infected with WNV or left
unmanipulated (−) and 24 h later cell lysates were generated and probed
with a polyclonal antibody against NS4B. The results are representative of
two independent experiments and loading controls (β-actin) are shown.
d, Control or SPCS1-/- cells were transfected with a 2K-NS4B-HA
plasmid. One day later, lysates were probed with an anti-HA antibody.
Results are representative of two independent experiments and loading
controls (β-actin) are shown. Cleaved (NS4B) and uncleaved (2K-NS4B)
bands are indicated on the right of the gel. e, Control or SPCS1-/- cells
were transfected with NS1, NS1-NS2A-Flag, or control plasmids. One
day later, lysates were probed with anti-NS1 (left) or anti-FLAG (right)
antibodies. Cleavage of NS1-NS2A-Flag results in expression of the
C-terminal Flag tag exclusively with the residual NS2A protein. The
results are representative of three independent experiments and loading
controls (β-actin) are shown. Note, expression of NS1-NS2A results in two
forms of NS1 (NS1 and NS1') owing to a ribosomal frameshift event that
occurs at a heptanucleotide motif near the beginning of the NS2A gene.
f, Control or SPCS1-/- cells were infected with WNV or left uninfected
and 24 h later cell lysates were generated and probed with a monoclonal
antibody against NS3. The results are representative of two independent
experiments and loading controls (β-actin) are shown. For gel source data,
see Supplementary Fig. 1.


[Extended Data Figure 7, image not reproduced. Normal and over-exposed
western blots of HCV-infected (+) and uninfected (−) cells at 48 and 72h,
with molecular mass markers.]



Extended Data Figure 7 | Western blotting for HCV E2 in control and
SPCS1 gene-edited Huh7.5 cells. Control or SPCS1 gene-edited cells were
infected with HCV (MOI 5; +) or left untreated (−) and 48 or 72h later
cell lysates were generated and probed with a mouse monoclonal antibody
against HCV E2 protein. The results are representative of two independent
experiments and a normal and over-exposed blot are shown. For gel source
data, see Supplementary Fig. 1.


[Extended Data Figure 8, image not reproduced. a, b, Western blots with
hE16 (anti-E) and CR4293 (anti-prM-E) of lysates from sgControl, sgSTT3A,
sgSPCS1, sgSPCS3, sgSEC61B and sgSEC63 cells. c, Levels of E protein
(SVPs) in supernatant at 24h, as optical density (450 nm), for an NS1
negative control and for prM-E in each sgRNA cell line. d, Anti-C (V5) and
anti-E western blots of control and SPCS1-/- cells transfected with the
C-prM-E plasmid with or without the WNV replicon; bands are labelled
C-prM-E, prM-E, C-prM, C and E.]

Extended Data Figure 8 | Effect of sgRNA on WNV structural protein
processing and production. a-c, The indicated gene-edited 293T cells
were transfected with a plasmid encoding WNV prM-E and subjected
to western blotting with hE16 (anti-E) (a) or CR4293 (anti-prM-E) (b).
Note the shift of the prM-E bands to high molecular mass in bulk
gene-edited cells with reduced expression of SPCS1 or SPCS3. The results are
representative of three independent experiments and a loading control
(β-actin) is shown. c, 293T cells expressing the indicated sgRNA were
transfected with a plasmid encoding prM-E. After 24h, supernatants
were collected and SVPs were quantified by capture ELISA. The results are
the average of several independent experiments performed in triplicate.
The asterisks indicate SVP levels in the supernatant that are statistically
different compared to control cells (****P < 0.001; ANOVA with a
multiple comparison correction). d, Control or SPCS1-/- clonal 293T
cells were transfected with a single C-prM-E plasmid containing an
N-terminal V5 tag fused to C (purple box) and native C-prM (green box)
and prM-E (blue box) leader sequences. In some experiments, a cDNA
launched WNV replicon was co-transfected to facilitate the cleavage of
C from prM by the viral NS2B-NS3 protease. Lysates were prepared 24h
later and probed with an anti-V5 (top) or anti-E (bottom) antibody. Note,
two separate gels were run for blotting with anti-V5 (C) and anti-E. One
representative experiment of two is shown and a loading control (β-actin)
for the top (anti-V5) gel is included. e, Control and SPCS1-/- 293T cells
were transfected with E or NS1 with or without prM co-transfection.
One day after transfection, cells were collected and lysates were western
blotted with antibodies against E (left) or NS1 (right). Molecular mass
markers and specific proteins are indicated to the left and right of each
gel, respectively. The results are representative of three independent
experiments. For gel source data, see Supplementary Fig. 1.


[Extended Data Figure 9, image not reproduced. a, b, Histograms of surface
staining for markers including HLA-A/B/C, HLA-E, ULBP1, CD3, CD7, CD30,
CD34, CD38, CD45, CD45RA, CD57, CD69, CD71, CD90 and CD107a. c, Western
blot for SPCS1 and β-actin.]



Extended Data Figure 9 | Expression of immune system antigens on
the surface of SPCS1 gene-edited cells. a, b, Control and SPCS1
gene-edited Jurkat cells were incubated with monoclonal antibodies against
the indicated cell surface antigens. After washing, cells were fixed with
paraformaldehyde and then processed by flow cytometry (a) or mass
cytometry (b). The histograms are as follows: black, isotype control in
wild-type cells; blue, control cells; red, SPCS1 gene-edited cells. Results are
representative of three independent experiments for flow cytometry and
one run on a mass cytometer in triplicate. c, Western blotting of
bulk-selected Jurkat cells (control and SPCS1 sgRNA selected) for expression of
SPCS1 (~12 kDa). For gel source data, see Supplementary Fig. 1.


[Extended Data Figure 10, image not reproduced. a, Volcano plot with the
log ratio on the x axis. b, c, Western blots of supernatants for LGALS3BP,
RNASET2, NPC2, PEDF and SFRP1, with molecular mass markers.]



Extended Data Figure 10 | Secretome analysis in control and SPCS1-/-
cell supernatants. a, Volcano plot from one-way ANOVA for secreted
protein abundances between control and SPCS1-/- 293T cells. The areas
of dots are proportional to the log2 standard deviation of protein ratios.
The vertical dashed lines delimit fold changes ± 1.1 and the horizontal
dashed line delimits P value < 0.05. The red dots show secreted proteins
using the SP_PIR classification. Values < log 0 indicate secreted proteins
that show reduced expression in SPCS1-/- 293T cells. b, c, Western
blotting of supernatants from control and SPCS1-/- 293T cells. b, Proteins
(LGALS3BP, RNASET2, and NPC2) identified as downregulated in
SPCS1-/- 293T cells by mass spectrometry (see Supplementary Tables
4 and 5). c, Proteins identified as having similar or possibly higher
levels in supernatants of SPCS1-/- 293T cells. For gel source data, see
Supplementary Fig. 1.
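
The two cut-offs quoted for the volcano plot (fold change of 1.1 and P < 0.05) can be illustrated with a minimal sketch. It assumes that each protein ratio is expressed as log2(SPCS1-/- over control), so that negative values mean reduced secretion, and that "± 1.1" means a 1.1-fold change in either direction; both are assumptions made for illustration, and the function name and example values are hypothetical rather than taken from the study.

```python
import math

# Cut-offs quoted in the caption; how they were applied is assumed here.
FOLD_CHANGE_CUTOFF = 1.1
P_VALUE_CUTOFF = 0.05


def classify(log2_ratio: float, p_value: float) -> str:
    """Place one protein in a region of the volcano plot.

    log2_ratio is assumed to be log2(SPCS1-/- abundance / control abundance).
    """
    threshold = math.log2(FOLD_CHANGE_CUTOFF)  # about 0.14 on the log2 axis
    if p_value >= P_VALUE_CUTOFF or abs(log2_ratio) <= threshold:
        return "unchanged"
    return "reduced" if log2_ratio < 0 else "increased"


if __name__ == "__main__":
    # Hypothetical (log2 ratio, P value) pairs, not data from the paper.
    examples = {"protein_A": (-0.50, 0.01), "protein_B": (0.05, 0.001),
                "protein_C": (0.30, 0.20)}
    for name, (ratio, p) in examples.items():
        print(name, classify(ratio, p))
```

A protein is only called changed when it passes both cut-offs, which is why a small but highly significant ratio still falls between the vertical dashed lines.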




LETTER 


doi:10.1038/nature18615 


Toremifene interacts with and destabilizes the Ebola virus glycoprotein


Yuguang Zhao1*, Jingshan Ren1*, Karl Harlos1, Daniel M. Jones1, Antra Zeltina1, Thomas A. Bowden1, Sergi Padilla-Parra1,2,
Elizabeth E. Fry1 & David I. Stuart1,3


Ebola viruses (EBOVs) are responsible for repeated outbreaks 
of fatal infections, including the recent deadly epidemic in 
West Africa. There are currently no approved therapeutic drugs 
or vaccines for the disease. EBOV has a membrane envelope 
decorated by trimers of a glycoprotein (GP, cleaved by furin 
to form GP1 and GP2 subunits), which is solely responsible for 
host cell attachment, endosomal entry and membrane fusion!~’. 
GP is thus a primary target for the development of antiviral 
drugs. Here we report the first, to our knowledge, unliganded 
structure of EBOV GP, and high-resolution complexes of GP with 
the anticancer drug toremifene and the painkiller ibuprofen. 
The high-resolution apo structure gives a more complete and 
accurate picture of the molecule, and allows conformational 
changes introduced by antibody and receptor binding to be 
deciphered*!°. Unexpectedly, both toremifene and ibuprofen 
bind in a cavity between the attachment (GP1) and fusion (GP2) 
subunits at the entrance to a large tunnel that links with equivalent 
tunnels from the other monomers of the trimer at the three-fold 
axis. Protein-drug interactions with both GP1 and GP2 are 
predominately hydrophobic. Residues lining the binding site are 
highly conserved among filoviruses except Marburg virus (MARV), 
suggesting that MARV may not bind these drugs. Thermal 
shift assays show up to a 14°C decrease in the protein melting 
temperature after toremifene binding, while ibuprofen has only a 
marginal effect and is a less potent inhibitor. These results suggest 
that inhibitor binding destabilizes GP and triggers premature 
release of GP2, thereby preventing fusion between the viral and 
endosome membranes. Thus, these complex structures reveal the 
mechanism of inhibition and may guide the development of more 
powerful anti-EBOV drugs. 

The recent outbreak of EBOV in West Africa, the worst of more 
than 30 in the past 40 years, comprised more than 28,000 cases and 
over 11,000 deaths?!. In the urgent need to find therapeutics, many 
small compounds and existing Food and Drug Administration (FDA)- 
approved drugs have been screened in vitro or in silico (for exam- 
ple, ibuprofen was suggested by docking experiments’) to find lead 
compounds for drug development or repurpose drugs for the treat- 
ment of EBOV disease’*"!°. Among these, a set of selective oestrogen 
receptor modulators (SERMs) stand out as potential inhibitors from 
in vitro and in vivo studies!*; however, their mechanism of action 
remains largely unknown. Using recombinant EBOV GP we tested 
whether nine such compounds could directly bind by a thermal-shift 
assay (Methods). The results show that toremifene in particular
markedly decreases the melting temperature (Tm) of EBOV GP, by up to
14°C at 100 µM (Fig. 1). This contrasts with the action of inhibitors
on most protein targets, which tend to increase stability’’, although 
destabilization has been reported before!®. Benztropine’®, the G- 
protein-coupled receptor (GPCR) antagonist, also decreases the Tm of




GP by 4°C, while other compounds, including ibuprofen, showed Tm
shifts of less than 2°C (Fig. 1, Extended Data Fig. 1). The destabilization
effects of toremifene and ibuprofen are both pH and concentration
dependent (Fig. 1). The binding constants (Kd values) determined by
this assay are 16 µM for toremifene and 6 mM for ibuprofen (Extended


[Figure 1, plots not reproduced. a, Melting temperature against pH (4.5 to
9.0) for EBOV GP, EBOV GP + toremifene 10 µM and EBOV GP + ibuprofen
10 µM. b, Melting temperature against concentration (0.0001 to 100 µM) for
EBOV GP + toremifene and EBOV GP + ibuprofen at pH 5.2.]


Figure 1 | Summary of thermal-shift assays. a, The effects of toremifene 
and ibuprofen on the melting temperature of EBOV GP at different pHs. 
The raw fluorescence traces are shown in Extended Data Fig. 1. The 
protein melting temperature at pH 5.2 at which the crystals were grown 
is taken as the reference point. b, The melting temperatures of EBOV GP 
at different concentrations of toremifene or ibuprofen, at pH 5.2. Data are 
mean ± s.d. (n=3).


1Division of Structural Biology, University of Oxford, Wellcome Trust Centre for Human Genetics, Headington, Oxford OX3 7BN, UK. 2Cellular Imaging Core, University of Oxford, Wellcome Trust
Centre for Human Genetics, Headington, Oxford OX3 7BN. 3Diamond Light Source Ltd, Harwell Science & Innovation Campus, Didcot OX11 0DE, UK.


*These authors contributed equally to this work. 


[Figure 2, image not reproduced. Labels in the panels include residues
287-293 and the DFF lid (inhibitor-binding site).]


Figure 2 | Overall structure. a, Cartoon diagram of EBOV GP monomer;
GP1 is in blue, GP2 is in red, and the glycan cap is in cyan. Secondary 
structural elements named as described previously*. Disulfide bonds 
shown as orange sticks, glycans in grey. The mucin domain omitted in 
our construct is shown as a yellow oval. FL, fusion loop. The C-terminal 
inserted foldon trimerization domain is disordered. b, The biological 


Data Fig. 1). In a mouse model", toremifene appears to be even more
potent (half-maximum inhibitory concentration (IC50) ~1 µM).
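
To give a feel for what the two binding constants imply, the following minimal sketch computes the fraction of GP that would be occupied at a given ligand concentration. It assumes a simple one-site equilibrium model with the ligand in excess, which is an illustrative assumption and not the fitting procedure used for the thermal-shift data; the function name is hypothetical.

```python
# Binding constants reported in the text, expressed in micromolar.
KD_TOREMIFENE_UM = 16.0
KD_IBUPROFEN_UM = 6000.0  # 6 mM


def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fractional occupancy for an assumed one-site model: c / (Kd + c)."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


if __name__ == "__main__":
    for name, kd in (("toremifene", KD_TOREMIFENE_UM),
                     ("ibuprofen", KD_IBUPROFEN_UM)):
        for conc in (1.0, 10.0, 100.0):
            print(f"{name} at {conc:g} uM: {fraction_bound(conc, kd):.3f}")
```

Under this assumption, 100 µM toremifene would occupy about 86% of sites while 100 µM ibuprofen would occupy under 2%, in line with the much smaller Tm shift seen for ibuprofen at the same concentration.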

The crystal structure of unliganded EBOV GP was determined at
2.2 Å resolution, with good R-factors and stereochemistry (Extended
Data Table 1). Three copies each of the GP1 and GP2 subunits (Fig. 2a)
form the biological trimer around the crystallographic three-fold
axis (Fig. 2b, c). This structure, although crystallized at pH 5.2,
represents the pre-fusion state of the molecule, with the GP1
receptor-binding site blocked by a glycan cap (Fig. 2e). GP1 is
predominantly composed of β-strands, forming a large semi-circular groove
at the centre of the subunit that clamps the α3 helix and β19-β20
strands of GP2 (Fig. 2d). The glycan cap is removed in the late
endosome by cathepsin B/L to expose the receptor Niemann-Pick disease
type C1 (NPC1)-binding site?”°?!. GP2 catalyses membrane fusion
and contains N-terminal (α3 and α4) and C-terminal (α5) heptad
repeats linked by a CX6CC motif (residues 601-609, Fig. 2a). The
C-terminal heptad repeat, disordered in all previously published GP
structures®!°, contributes to the trimer interface in our structure
(Fig. 2b, c) and contains N618, which is glycosylated as predicted.
The well-ordered CX6CC motif forms intrasubunit (C601-C608)
and intersubunit (C53-C609) disulfide bonds (Fig. 3). Mutation of
any of these cysteine residues renders the virus incapable of entering
host cells””. In the fusion process, GP2 undergoes conformational
changes in which α5 refolds onto a helix coalesced from α3 and
α4 to form a six-helix bundle” (Extended Data Fig. 2). In our
pre-fusion structure, the hydrophobic GP2 fusion loop (residues 511-554)
(Fig. 2a) projects into a neighbouring monomer and is stabilized in a
shallow depression surrounded by loops β4-β5 and β10-β11, and α3.
Apart from residues 521-526, which have very loose interactions
with the rest of the protein, the fusion loop in this pH 5.2 apo GP is
very similar to that in the KZ52 Fab complex crystallized at pH 8.3
(Extended Data Fig. 3). This is in contrast to the large conformational




trimer viewed perpendicular to the threefold axis with one monomer 
coloured as in a and the second and third faded for clarity. c, The trimer 
viewed along the threefold, towards the viral membrane. d, Close up of the 
inhibitor binding site. e, Close up of the glycan cap and receptor binding 
site. Areas shown in d and e are indicated in panel a. In d and e antigenic 
sites are coloured grey and the receptor binding site yellow. 


changes reported for the isolated fusion loop at different pHs (ref. 24), 
suggesting that GP1 maintains GP2 in the pre-fusion state until their 
separation triggered by receptor binding. 

In total, 319 out of 394 Cα atoms in our apo GP structure match
with the GP-Fab complex’, with a root mean squared deviation
(r.m.s.d.) value of 1.1 Å (Fig. 3a-d); however, there are marked
differences beyond the C-terminal heptad repeat and CX6CC motif. The
β1-β2 hairpin interacts directly with the KZ52 Fab and is pushed 6 Å
inwards in the Fab complex (Fig. 3c). The glycan cap is better ordered
in the apo structure, revealing an extra strand, β18′, inserted between
β17 and β18, overlapping β18 in the Fab complex but running in the
opposite direction (Fig. 3b). Several cross-reactive neutralizing
monoclonal antibodies from EBOV survivors bind to the cap”, suggesting
that this conserved epitope is important for antibody-mediated
clearance of the virus. Our structure defines this conformational epitope.
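
The r.m.s.d. quoted for matched Cα atoms is the square root of the mean squared distance between equivalent atoms once the two structures have been superposed. The sketch below shows only that final calculation; it assumes the atom pairs have already been matched and superposed (the superposition step, and the software used for it, are not described in this text), and the function name and coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean squared deviation between paired, pre-superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms, not taken from the structures.
    apo = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    complexed = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"r.m.s.d. = {rmsd(apo, complexed):.2f} A")
```

Because only matched atoms enter the sum, the value depends on which atoms are counted as equivalent, which is why the text reports the number of matching Cα atoms alongside each r.m.s.d.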

In the apo GP structure (excluding the glycan cap), 230 Cα atoms
overlay with the GP in the NPC1 receptor complex”, with an r.m.s.d.
of 0.9 Å (Fig. 3e). NPC1 binding draws helix α1 approximately 2 Å
towards the receptor, causing the preceding 3-helix α1′ to unwind,
disrupting interactions with α3 of GP2, as described previously?
(Fig. 3f). These structural changes also break hydrogen bonds from
α1′ to the amide groups of residues K510 and N512, disordering the
N-terminal residues 502-507 of GP2. The structural differences
continue to the other side of α3. In the NPC1-bound structure, the α3
helix starts two residues earlier and the β1-β2 hairpin bends inwards,
adopting a conformation similar to that in the KZ52 Fab complex
(Fig. 3c, g). In addition, there is a large tunnel between neighbouring
monomers (Fig. 4, Extended Data Fig. 4), whose hydrophobic entrance
is formed by surrounding residues from the β1-β2 hairpin, β3, β6
and β13 of GP1, and α3, β19 and β20 of GP2. Residues 192-195 (with
sequence DFFS, and named the DFF lid hereafter) form a tight turn with
F193 and F194 plugging the entrance of the tunnel and making tight




Figure 3 | Structure comparisons. a, Structure of the apo GP monomer
compared with the GP of the KZ52 Fab complex. b-d, Details of the
structural differences at the glycan cap (b), β1-β2 hairpin (c) and the
CX6CC motif (d). The apo GP is coloured as in Fig. 2a, the GP in complex
with Fab grey. e, Comparison of apo GP with GP from the GP-receptor
complex shown in same style as a. f, g, Close-up views of the major
structural differences at α1′ and α1 helices (f), and β1-β2 hairpin (g).


interactions with 819 and 820 (Fig. 4a and Extended Data Fig. 5a). 
This structure may also hold the putative cleavage site residue 190 
in position (Extended Data Fig. 5a)—in the endosome the glycan 
cap is cleaved to free the receptor-binding site of GP1 (refs 25-29), 
which also exposes the entrance of the tunnel. Receptor binding is 
proposed to trigger unwinding of GP2 from GP1 and subsequently 
lead to membrane fusion”’. The above structural changes resulting 
from receptor binding probably all contribute to weaken GP1-GP2 
interactions. Both α1′ and α1 are shielded by residues 287-293 and
the N563 glycan (which is resistant to enzymatic deglycosylation), 
perhaps preventing premature release of GP2 (Extended Data Fig. 6). 

Structures of GP-toremifene and GP-ibuprofen complexes were
obtained by crystal soaking and refined to 2.7 Å resolution (Extended
Data Table 1). Both inhibitors have well-defined electron density and
bind at the same site at the entrance of the large tunnel by expelling 
the DFF lid (Fig. 4 and Extended Data Figs 4, 5, 7). In addition to the 
tunnel entrance residues already mentioned, the tunnel is lined by 
residues from the N-terminal loop, the β1-β2 hairpin, β2-β3 loop of
GP1, and α3 and α4 of GP2 from a neighbouring monomer, and
interconnected with the other two tunnels in the trimer at the three-fold
axis (Fig. 4b and Extended Data Fig. 4). Y517 makes dominant
interactions with toremifene by contacting all three phenyl rings (Fig. 4c).
In addition, phenyl ring A of toremifene is fully buried and interacts 
with V66, L68, L515 and L558, ring B with L186, and ring C with V66 
and A101. The ethyl chloride group interacts with L184, L186, M548 
and L558, while the dimethylethanamine group points towards the 
main tunnel and is surrounded by polar/charged residues, including 
R64, E100, T519, T520 and D522 (Fig. 4c). 

Ibuprofen is bound with its isobutyl group partially overlapping the 
ethyl chloride group of the toremifene but closer to L554. However, its 
phenyl ring does not overlap any of the rings of toremifene (Extended 
Data Fig. 5), but makes extensive interactions with M548 (Fig. 4d). 
The propanoic acid moiety is orientated to make a hydrogen bond 
to the side chain of R64 and hydrophobic interactions with Y517. 
Remarkably, ibuprofen was initially suggested to interact with EBOV 
GP by in silico screening, and predicted to dock in a pocket of the 
mucin domain¹². A racemic mixture of ibuprofen was used for all
experiments; however, we note that the S-isomer (which is also active
as a painkiller) binds preferentially.

The flexible region, 521-526, of the fusion loop is stabilized in the
two inhibitor-bound structures, but in different conformations compared
to apo GP. The most notable conformational changes induced
by toremifene are at the side chains of M548 and L554, and those
induced by ibuprofen are at M548 (Fig. 4). The residues involved in
inhibitor binding are highly conserved across filoviruses, with the exception
of MARV (Extended Data Fig. 8), where the DFF lid and its preceding loop are
replaced by a helix, and V66 and A101 are substituted by M50 and E85,
respectively, partially blocking the binding site³⁰.

The SERMs tamoxifen, 4-hydroxytamoxifen and clomiphene are less 
potent inhibitors, despite their chemical similarity to toremifene'*”». 
Compared to the ethyl chloride group of toremifene, the correspond- 
ing ethyl group in tamoxifen and chlorine in clomiphene are expected 
to make weaker interactions with L184, L186, M548 and L558. A par- 
tially bound 4-hydroxytamoxifen structure obtained by crystal soaking 


Figure 4 | Inhibitor-binding site. a, Details of the inhibitor-binding site in
the apo GP. The backbone is shown as ribbons with GP1 in blue and GP2
in red, side chains as grey sticks. b, Tunnels of the GP trimer viewed along
the three-fold axis towards the viral membrane. Toremifenes bound at the
entrances of the tunnels are shown as yellow sticks. c, d, Details of
protein-inhibitor interactions of the GP-toremifene complex (c) and
GP-ibuprofen complex (d). Toremifene is shown as yellow sticks,
ibuprofen as cyan sticks. Protein main chains are shown as ribbons and
side chains as sticks (GP1 blue, GP2 red). Side chains in the apo structure
with large conformational changes are shown as thinner grey sticks.




(data not shown) shows the hydroxyl group makes close contacts with
G67, shifting the whole inhibitor ~1.0 Å towards the solvent, weakening
ring-stacking interactions with Y517 and having no contacts from
the ethyl group to L184, L186 and L558 compared to toremifene. Our
crystallographic results are in line with the inhibition data¹⁴⁻¹⁶ and our
thermal-shift assay (Extended Data Fig. 1e, f). If toremifene and ibuprofen
inhibit viral infection by causing premature conversion of GP to
the post-fusion conformation or blocking receptor binding, we would
expect them to abolish viral fusion. This was confirmed by measuring
their effect on the fusion of EBOV GP pseudoviruses as judged by a
β-lactamase reporter assay (Extended Data Fig. 9). Benztropine, which
decreases the Tm of GP by 4 °C, could not be soaked in our crystals,
and needs further investigation (Extended Data Fig. 1f).

Our results pinpoint the binding site of toremifene and ibuprofen 
on the surface of the GP and reveal that they decrease the stability 
of the viral GP, and prevent viral fusion. The binding site is different 
to that predicted for ibuprofen¹², and the information on the binding
modes of these compounds and the spare volume at the binding
cavity can guide the design of more potent compounds. Finally, our 
readily grown well-diffracting crystals are suitable for fragment-based 
screening for different classes of binders for the development of new 
inhibitors to combat EBOV infection. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 16 March; accepted 26 May 2016. 
Published online 29 June 2016. 


1. Takada, A. et al. A system for functional analysis of Ebola virus glycoprotein. 
Proc. Natl Acad. Sci. USA 94, 14764-14769 (1997). 

2. Hacke, M. et al. Inhibition of Ebola virus glycoprotein-mediated cytotoxicity by 
targeting its transmembrane domain and cholesterol. Nat. Commun. 6, 7688 
(2015). 

3. Aleksandrowicz, P. et al. Ebola virus enters host cells by macropinocytosis and
clathrin-mediated endocytosis. J. Infect. Dis. 204 (suppl. 3), S957-S967 (2011).

4. Carette, J. E. et al. Ebola virus entry requires the cholesterol transporter
Niemann-Pick C1. Nature 477, 340-343 (2011).

5. Côté, M. et al. Small molecule inhibitors reveal Niemann-Pick C1 is essential
for Ebola virus infection. Nature 477, 344-348 (2011).

6. Nanbo, A. et al. Ebolavirus is internalized into host cells via macropinocytosis in
a viral glycoprotein-dependent manner. PLoS Pathog. 6, e1001121 (2010).

7. Saeed, M. F., Kolokoltsov, A. A., Albrecht, T. & Davey, R. A. Cellular entry of Ebola
virus involves uptake by a macropinocytosis-like mechanism and subsequent
trafficking through early and late endosomes. PLoS Pathog. 6, e1001110 (2010).

8. Lee, J. E. et al. Structure of the Ebola virus glycoprotein bound to an antibody
from a human survivor. Nature 454, 177-182 (2008).

9. Wang, H. et al. Ebola viral glycoprotein bound to its endosomal receptor
Niemann-Pick C1. Cell 164, 258-268 (2016).

10. Dias, J. M. et al. A shared structural solution for neutralizing ebolaviruses.
Nat. Struct. Mol. Biol. 18, 1424-1427 (2011).

11. World Health Organization. Ebola Situation Reports; http://www.who.int/csr/ 
disease/ebola/situation-reports/en/ (2016). 

12. Veljkovic, V. et al. In silico analysis suggests repurposing of ibuprofen for 
prevention and treatment of EBOLA virus disease. F1000Res. 4, 104 (2015). 

13. Edwards, M. R. et al. High-throughput minigenome system for identifying
small-molecule inhibitors of Ebola virus replication. ACS Infect. Dis. 1, 380-387 
(2015). 

14. Johansen, L. M. et al. FDA-approved selective estrogen receptor modulators 
inhibit Ebola virus infection. Sci. Transl. Med. 5, 190ra79 (2013). 

15. Johansen, L. M. et al. A screen of approved drugs and molecular probes 
identifies therapeutics with anti-Ebola virus activity. Sci. Transl. Med. 7, 290ra89 
(2015). 




16. Kouznetsova, J. et al. Identification of 53 compounds that block Ebola virus-like 
particle entry via a repurposing screen of approved drugs. Emerg. Microbes 
Infect. 3, e84 (2014). 

17. Lea, W. A. & Simeonov, A. Differential scanning fluorometry signatures as 
indicators of enzyme inhibitor mode of action: case study of glutathione 
S-transferase. PLoS One 7, e36219 (2012). 

18. Cimmperman, P. et al. A quantitative model of thermal stabilization and
destabilization of proteins by ligands. Biophys. J. 95, 3222-3231 (2008). 

19. Cheng, H. et al. Inhibition of Ebola and Marburg virus entry by G protein- 
coupled receptor antagonists. J. Virol. 89, 9932-9938 (2015). 

20. Brecher, M. et al. Cathepsin cleavage potentiates the Ebola virus glycoprotein 
to undergo a subsequent fusion-relevant conformational change. J. Virol. 86, 
364-372 (2012). 

21. Zhao, Y., Ren, J., Harlos, K. & Stuart, D. I. Structure of glycosylated NPC1
luminal domain C reveals insights into NPC2 and Ebola virus interactions.
FEBS Lett. 590, 605-612 (2016).

22. Jeffers, S. A., Sanders, D. A. & Sanchez, A. Covalent modifications of the ebola
virus glycoprotein. J. Virol. 76, 12463-12472 (2002).

23. Weissenhorn, W., Carfi, A., Lee, K. H., Skehel, J. J. & Wiley, D. C. Crystal structure 
of the Ebola virus membrane fusion subunit, GP2, from the envelope 
glycoprotein ectodomain. Mol. Cell 2, 605-616 (1998). 

24. Gregory, S. M. et al. Structure and function of the complete internal fusion loop
from Ebolavirus glycoprotein 2. Proc. Natl Acad. Sci. USA 108, 11211-11216
(2011).

25. Flyak, A. I. et al. Cross-reactive and potent neutralizing antibody responses
in human survivors of natural Ebolavirus infection. Cell 164, 392-405
(2016).

26. Chandran, K., Sullivan, N. J., Felbor, U., Whelan, S. P. & Cunningham, J. M. 
Endosomal proteolysis of the Ebola virus glycoprotein is necessary for 
infection. Science 308, 1643-1645 (2005). 

27. Dube, D. et al. The primed ebolavirus glycoprotein (19-kilodalton GP1,2): 
sequence and residues critical for host cell binding. J. Virol. 83, 2883-2891 
(2009). 

28. Hood, C. L. et al. Biochemical and structural characterization of cathepsin
L-processed Ebola virus glycoprotein: implications for viral entry and 
immunogenicity. J. Virol. 84, 2972-2982 (2010). 

29. Schornberg, K. et al. Role of endosomal cathepsins in entry mediated by the 
Ebola virus glycoprotein. J. Virol. 80, 4174-4178 (2006). 

30. Hashiguchi, T. et al. Structural basis for Marburg virus neutralization by a
cross-reactive human antibody. Cell 160, 904-912 (2015).


Acknowledgements We thank Diamond scientists at I02 and I03 for assistance
with data collection, T. S. Walter for help with crystallization and thermal-shift
assay. Y.Z. was supported by the Biostruct-X project (283570) funded by the
EU Seventh Framework Programme (FP7), J.R. by the Wellcome Trust, and D.I.S.,
E.E.F. and K.H. by the UK Medical Research Council (MR/N00065X/1). This is a
contribution from the UK Instruct Centre. The Wellcome Trust Centre for Human
Genetics is supported by the Wellcome Trust (grant 090532/Z/09/Z). A.Z. is
supported by a Marie Curie Fellowship (658363), T.A.B. is supported by the
MRC (MR/L009528/1). S.P.-P. is funded by a Nuffield Department of Medicine
Leadership Fellowship. D.I.S. is a Jenner Investigator.


Author Contributions Y.Z., J.R. and D.I.S. designed the project. Y.Z. made
the protein and grew the crystals together with J.R., collected X-ray data 
and determined the structures. K.H. helped with crystal mounting and data 
collection. D.M.J. and S.P. carried out cell imaging experiments. A.Z. and 
T.A.B. provided the cDNA. Y.Z., J.R., E.E.F. and D.I.S. analysed the results and 
wrote the manuscript in discussions with all authors. 


Author Information The atomic coordinates and structure factors have been 
deposited with the RCSB Protein Data Bank under accession codes 5JQ3 
(native GP), 5JQ7 (GP-toremifene) and 5JQB (GP-ibuprofen). Reprints and 
permissions information is available at www.nature.com/reprints. The authors 
declare competing financial interests: details are available in the online version 
of the paper. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
D.I.S. (dave@strubi.ox.ac.uk). 


Reviewer Information Nature thanks E. Saphire, W. Weissenhorn and the other 
anonymous reviewer(s) for their contribution to the peer review of this work. 




METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized, and investigators were not blinded to allocation during 
experiments and outcome assessment. 

Protein cloning, expression and purification. Zaire EBOV (strain Mayinga-76)
glycoprotein extracellular domain DNA was synthesized (UniProtKB entry
Q05320). The expression construct GPΔ contains two directly linked sections,
amino acids 32-312 and 464-632, with a T42A mutation to eliminate N40
glycosylation. At the N terminus of the protein, the four amino acids ETGR were added
from the expression vector pNeosec³¹. At the C terminus, a foldon trimerization
sequence from the bacteriophage T4 fibritin and a 6× His tag were added with
the sequence: GSGYIPEAPRDGQAYVRKDGEWVLLSTFLGTHHHHHH. The
endotoxin-free pNeosec-GPΔ plasmid was transiently transfected into human
embryonic kidney HEK293T (ATCC CRL11268) cells with polyethylenimine
(PEI, molecular mass 25 kDa, Sigma). To inhibit the formation of complex
glycosylation, the mannosidase inhibitor kifunensine (Cayman Chemical) was added to
a final concentration of 5 µM. After 5 days of transfection, the conditioned medium
was collected, dialysed against PBS and incubated with TALON beads (Takara Bio
Europe SAS) at 15 °C for 1 h with gentle shaking. The beads were collected and
washed with PBS plus 5-10 mM imidazole. The protein was eluted with 200 mM
imidazole in PBS and further purified by size-exclusion chromatography with a
Superdex 200 HiLoad 16/600 column (GE Healthcare) and a buffer of 10 mM MES,
pH 5.2, 150 mM NaCl.

Thermal-shift assay. 10 µl of solution containing 2 µM glycosylated EBOV GP
protein, buffered by the addition of 10 µl of 850 mM sodium malonate at the desired
pH, was mixed with 10 µl of 15× SYPRO Orange dye (Thermo Fisher Scientific),
along with 10 µl of 50 µM compound in 10% DMSO or just 10% DMSO. The
mixture was made up to a total volume of 50 µl. Samples were placed in a semi-skirted
96-well PCR plate (4titude), sealed and heated in an Mx3005p qPCR machine
(Stratagene, Agilent Technologies) from 24.5 to 98.5 °C at a rate of 1 °C min⁻¹.
Fluorescence changes were monitored with excitation and emission wavelengths
at 492 and 610 nm, respectively. Reference wells, that is, solutions without
chemical, but with the same amount of DMSO, were used to compare the melting
temperature (Tm). Experiments were carried out in triplicate. Compounds tested
included toremifene, tamoxifen, 4-hydroxytamoxifen, anastrozole, benztropine,
clomipramine, ibuprofen, diacylglycerol kinase inhibitor and benzodiazepine. The
SERMs endoxifen and clomiphene could not be tested since they bind to SYPRO
Orange directly.
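
As a worked check of the volumes quoted above, the final in-well concentrations follow from a simple dilution, C_final = C_stock × V_added / V_total. The short Python sketch below assumes that the quoted concentrations (2 µM protein, 15× dye, 50 µM compound, 10% DMSO) refer to the stock solutions added and that the 50 µl total is the final well volume; the helper name final_conc is illustrative only and is not part of the published protocol.

```python
# Final concentrations in one thermal-shift well, from the volumes in the text.
# Assumption: quoted concentrations are stocks; 50 µl is the final well volume.

TOTAL_UL = 50.0  # final well volume (µl)


def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution formula: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


protein_um = final_conc(2.0, 10.0)      # 2 µM GP stock, 10 µl added
malonate_mm = final_conc(850.0, 10.0)   # 850 mM sodium malonate, 10 µl added
dye_x = final_conc(15.0, 10.0)          # 15x SYPRO Orange, 10 µl added
compound_um = final_conc(50.0, 10.0)    # 50 µM compound, 10 µl added
dmso_pct = final_conc(10.0, 10.0)       # compound supplied in 10% DMSO

print(f"protein  {protein_um:.1f} µM")
print(f"malonate {malonate_mm:.0f} mM")
print(f"dye      {dye_x:.0f}x")
print(f"compound {compound_um:.0f} µM")
print(f"DMSO     {dmso_pct:.0f} %")
```

Under these assumptions the compound ends up at 10 µM in 2% DMSO, which matches the conditions stated in the legend of Extended Data Fig. 1.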

Ebola pseudovirus production and titration. HIV-1-derived pseudoviruses
expressing the Ebola virus envelope glycoproteins (EBOV pseudoparticle,
EBOVpp) were produced as described previously³². HEK-293T cells in T175
flasks were transfected with 2 µg pR8ΔEnv, 2 µg BlaM-Vpr, 1 µg pcREV and
3 µg of plexm-EBOV_GP plasmids (containing Zaire EBOV GP residues 1-676
under control of β-actin/CMV chimaeric promoter). After 8 h of transfection, the
medium was replaced by fresh DMEM with 10% FBS. Viral supernatants were
collected and concentrated using the Lenti-X Concentrator (Clontech). Virus titres
were determined by infecting TZM-bl cells (PTA-5659, no mycoplasma
contamination) with a serial dilution of concentrated pseudovirus followed by a β-Gal assay.
Since the TZM-bl cells contain a β-Gal expression cassette with an HIV-1-induced
promoter infected cells can be identified through hydrolysis of X-gal**. 

BlaM assay and analysis. The β-lactamase assay³³ was applied to assess EBOVpp
fusion. 24 h before the assay, TZM-bl cells were plated at 4 × 10⁴ cells per well in
black clear-bottomed 96-well plates. On the day of assay, cells were cooled on ice
before the addition of EBOVpp (MOI 0.5), then centrifuged at 2,100g for 30 min
at 4 °C. Viral supernatants were removed and cells washed with PBS. Then, 100 µl
of DMEM plus 10% FBS containing toremifene (15 µM or 1.5 µM), ibuprofen
(150 µM or 15 µM), or the same amount of solvent for the above compounds
(5% DMSO) was added to each well before placing in a 37 °C, CO₂ incubator to
initiate viral entry. After 120 min, cells were loaded with 1× CCF2-AM from the
LiveBLAzer FRET-B/G Loading Kit (Life Technologies) and incubated at room
temperature in the dark for 2 h. After CCF2-AM removal, cells were washed with
PBS and fixed with 2% PFA before viewing. Cells were excited using a 405 nm
continuous laser (Leica) and the emission spectra between 430 and 560 nm were
recorded pixel by pixel (512 × 512) using a Leica SP8 X-SMD microscope with
a 20× objective. The ratio of blue emission (440-480 nm, cleaved CCF2-AM)
to green (500-540 nm, uncleaved CCF2-AM) was then calculated pixel by pixel
using a customized macro³⁴ for ImageJ (http://imagej.nih.gov/ij/) with ten different
observation fields for each condition. A blue/green threshold (fusion threshold)
was set using HIV NoEnv virions containing Vpr-BlaM as a background control to
provide a fusion detection limit that corresponded to 0.75 ± 0.05 BlaM ratio. The
fusion threshold was calculated by recovering the signal (blue/green intensity ratio)
coming from individual cells plus 2 × s.d. from ~200 cells in the observation field.
This threshold was then applied to all conditions. Cells above the threshold were
pseudocoloured in red and cells below the threshold pseudocoloured in blue. 'Red'
cells were then compared with blue cells (non-fusogenic) as an accurate measure
of fusion in different conditions.
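
The thresholding step described above can be sketched in a few lines of Python. This is an illustration only, not the authors' ImageJ macro: it assumes that the threshold is the mean per-cell blue/green ratio of the background control plus two sample standard deviations, and that a cell counts as fusogenic when its ratio exceeds that threshold. The function names and the ratio values are hypothetical, chosen purely to exercise the code.

```python
from statistics import mean, stdev


def fusion_threshold(background_ratios, n_sd=2.0):
    """Mean of the background-control ratios plus n_sd sample s.d. (assumed form)."""
    return mean(background_ratios) + n_sd * stdev(background_ratios)


def percent_fusogenic(cell_ratios, threshold):
    """Percentage of cells whose blue/green ratio lies above the threshold."""
    if not cell_ratios:
        raise ValueError("no cells supplied")
    above = sum(1 for ratio in cell_ratios if ratio > threshold)
    return 100.0 * above / len(cell_ratios)


# Hypothetical per-cell blue/green ratios (not data from the paper).
background = [0.60, 0.65, 0.70, 0.65, 0.60, 0.70]  # NoEnv control cells
treated = [0.50, 0.90, 1.20, 0.60, 1.50]            # cells from one condition

threshold = fusion_threshold(background)
fusogenic_pct = percent_fusogenic(treated, threshold)
print(f"threshold = {threshold:.3f}, fusogenic cells = {fusogenic_pct:.0f}%")
```

Cells above the threshold would correspond to those pseudocoloured red, and cells below it to those pseudocoloured blue.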

Crystallization and inhibitor soaking. For crystallization, the GP was treated with
endo-β-acetylglucosaminidase F1 at 37 °C for 1 h, further purified with size-exclusion
chromatography and concentrated to 10-12 mg ml⁻¹. Crystallization screen
experiments were carried out using the nanolitre sitting-drop vapour diffusion method
in 96-well plates as previously described³⁵. Crystals were initially obtained from
Hampton Research PEGRx 1 screen, condition 37 containing 10% (w/v) PEG 6,000
and 0.1 M sodium citrate tribasic dihydrate (pH 5.0). The best crystals were grown
in the optimized condition containing 9% (w/v) PEG 6,000 and 0.1 M sodium
citrate tribasic dihydrate at pH 5.2.

To obtain GP-inhibitor complexes, crystal-soaking experiments were
performed. Crystal-soaking solutions were prepared by first dissolving the
inhibitors in 100% DMSO and then diluting with 15% (w/v) PEG 6,000 and 0.1 M
sodium citrate tribasic dihydrate (pH 5.0) to a final DMSO concentration of 10%.
The inhibitor concentration was typically from 1 to 10 mM depending on solubility.
Crystals were soaked in the above solutions for between 20 min and 5 h. Crystals
soaked for longer normally diffracted less well and the crystals from which the
GP-toremifene and GP-ibuprofen complex structures were obtained were only
soaked for 20 min.

Data collection and structure determination. Crystals were cryo-protected using
solutions containing 75% crystallization liquor (or inhibitor soaking solution) and
25% (v/v) glycerol and frozen in liquid nitrogen before data collection. All data
were collected from frozen crystals at 100 K. Data were acquired as 0.1° images on
PILATUS 6M detectors at Diamond Light Source, using beamline I03 for native
data (exposure time 0.1 s per frame, beam size 80 × 20 µm and 30% beam
transmission), and I02 for inhibitor-soaked crystals (exposure time 0.05 s per frame, beam
size 90 × 25 µm and 40% beam transmission). Diffraction images were indexed,
integrated and scaled with the automated data processing program xia2-3dii³⁶. The
native data set was collected from four crystals to 2.23 Å resolution with 58-fold
redundancy. A total of seven inhibitors were soaked, including toremifene, tamoxifen,
4-hydroxytamoxifen, raloxifene, clomiphene, ibuprofen and benztropine, and
diffraction data were collected with resolutions ranging from 3.5 to 2.3 Å.

The crystals belong to space group R32 with unit cell dimensions a = b = 114.0 Å
and c = 307.0 Å approximately. The apo structure was determined by molecular
replacement with MOLREP³⁷ using the GP structure of the GP-KZ52 Fab
complex (PDB ID, 3CSY) as a search model. There is one GP molecule in the crystal
asymmetric unit. The biological trimer is formed by a crystallographic three-fold
axis. Structure refinement used REFMAC³⁸ and models were rebuilt with
COOT³⁹. The apo structure was refined to 2.23 Å resolution with an Rwork value
of 0.223 (Rfree = 0.251) and good stereochemistry. Close examination of the data
from inhibitor-soaked crystals showed that only toremifene and ibuprofen were
fully bound with GP, and structures were refined to resolutions of 2.69 Å and
2.68 Å, respectively. 4-Hydroxytamoxifen was only bound with partial occupancy
(data not shown). Data collection and structure refinement statistics are given in
Extended Data Table 1. Structural comparisons used SHP⁴⁰; figures were prepared
with PyMOL⁴¹.
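
For readers less familiar with the refinement statistics quoted above, the conventional crystallographic residual is reproduced here. This is the standard textbook definition and is given as an aid only; the Methods do not spell out the formula, and the symbols below are the usual ones rather than notation taken from the paper.

```latex
% Conventional crystallographic R factor (standard definition, not quoted from the paper).
% F_obs and F_calc are the observed and model-calculated structure-factor amplitudes.
\[
  R \;=\; \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
               {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
\]
% R_work: the sum runs over the reflections used in refinement.
% R_free: the same expression evaluated over a test set of reflections
%         that is excluded from refinement.
```

A lower value indicates closer agreement between the model and the diffraction data, and Rfree is normally somewhat higher than Rwork because its reflections play no part in fitting the model.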


31. Zhao, Y., Ren, J., Padilla-Parra, S., Fry, E. E. & Stuart, D. I. Lysosome sorting of
β-glucocerebrosidase by LIMP-2 is targeted by the mannose 6-phosphate
receptor. Nat. Commun. 5, 4321 (2014).

32. Desai, T. M. et al. IFITM3 restricts influenza A virus entry by blocking the 
formation of fusion pores following virus-endosome hemifusion. PLoS Pathog. 
10, e1004048 (2014). 

33. Cavrois, M., De Noronha, C. & Greene, W. C. A sensitive and specific enzyme-
based assay detecting HIV-1 virion fusion in primary T lymphocytes.
Nat. Biotechnol. 20, 1151-1154 (2002).

34. Jones, D. M. & Padilla-Parra, S. Imaging real-time HIV-1 virion fusion with 
FRET-based biosensors. Sci. Rep. 5, 13449 (2015). 

35. Walter, T. S. et al. A procedure for setting up high-throughput nanolitre
crystallization experiments. Crystallization workflow for initial screening, 
automated storage, imaging and optimization. Acta Crystallogr. D 61, 651-657 
(2005). 

36. Winter, G., Lobley, C. M. & Prince, S. M. Decision making in xia2. Acta Crystallogr. 
D 69, 1260-1273 (2013). 

37. Vagin, A. & Teplyakov, A. Molecular replacement with MOLREP. Acta Crystallogr. 
D 66, 22-25 (2010). 

38. Winn, M. D., Murshudov, G. N. & Papiz, M. Z. Macromolecular TLS refinement in 
REFMAC at moderate resolutions. Methods Enzymol. 374, 300-321 (2003). 

39. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. 
Acta Crystallogr. D 60, 2126-2132 (2004). 

40. Stuart, D. I., Levine, M., Muirhead, H. & Stammers, D. K. Crystal structure of cat
muscle pyruvate kinase at a resolution of 2.6 Å. J. Mol. Biol. 134, 109-142
(1979).

41. DeLano, W. L. & Lam, J. W. PyMOL: A communications tool for computational
models. Abstr. Pap. Am. Chem. Soc. 230, U1371-U1372 (2005).




[Figure: panels a-f, thermal melt curves plotted against temperature (20-100 °C) for GP alone and GP with the compounds named in the legend; panels g, h, shift in melting temperature plotted against toremifene concentration (0-1,200 µM; Kd: 16 ± 4 µM) and ibuprofen concentration (0-16 mM; Kd: 6 ± 2 mM).]

Extended Data Figure 1 | Thermal-shift assay. Representative thermal
melt curves of EBOV GP with 10 µM compounds and 2% DMSO.
a-d, Melting curves of EBOV GP with toremifene, ibuprofen or protein
alone at pH 5.0, 6.0, 7.0 and 8.0, respectively. e, Small effects of SERM
inhibitors tamoxifen, 4-hydroxytamoxifen and raloxifene on the melting
temperature of EBOV GP shown at pH 5.2. f, Melt curves of EBOV GP
with diacylglycerol kinase inhibitor, anastrozole and benztropine mesylate at
pH 5.2. g, h, Shifts in melting temperature (ΔTm, °C in absolute value) were
plotted against different concentrations of toremifene (g) or ibuprofen
(h) at pH 5.2. Data are mean ± s.d. (n = 4). The affinity constant Kd is
calculated by a ligand binding 1:1 saturation fitting with SigmaPlot
version 13 (Systat Software Inc.).
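
The 1:1 saturation model named in the legend has the form ΔTm = ΔTmax × [L] / (Kd + [L]). The authors fitted it in SigmaPlot; the Python sketch below is a dependency-free stand-in that scans Kd on a logarithmic grid and, for each trial Kd, solves for the best ΔTmax in closed form by linear least squares. The function names, the grid bounds and the synthetic data points are assumptions made for illustration and are not values from the paper.

```python
def saturation(conc, dt_max, kd):
    """One-site (1:1) saturation binding: dT = dT_max * [L] / (Kd + [L])."""
    return dt_max * conc / (kd + conc)


def _amplitude_and_sse(concs, shifts, kd):
    """For a fixed Kd the model is linear in dT_max, so solve it exactly."""
    xs = [c / (kd + c) for c in concs]
    dt_max = sum(x * y for x, y in zip(xs, shifts)) / sum(x * x for x in xs)
    sse = sum((y - dt_max * x) ** 2 for x, y in zip(xs, shifts))
    return dt_max, sse


def fit_one_site(concs, shifts, kd_min=1e-3, kd_max=1e6, steps=2000):
    """Grid-search Kd (log-spaced) and return the least-squares (dT_max, Kd)."""
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        dt_max, sse = _amplitude_and_sse(concs, shifts, kd)
        if best is None or sse < best[2]:
            best = (dt_max, kd, sse)
    return best[0], best[1]


# Synthetic, noise-free points generated from dT_max = 15 and Kd = 20
# (arbitrary units); these are NOT measurements from the paper.
concs = [5.0, 10.0, 20.0, 50.0, 100.0, 400.0, 1000.0]
shifts = [saturation(c, 15.0, 20.0) for c in concs]

dt_max_fit, kd_fit = fit_one_site(concs, shifts)
print(f"dT_max ~ {dt_max_fit:.2f}, Kd ~ {kd_fit:.2f}")
```

The fitted Kd is returned in the same concentration units as the input, and its precision is limited by the grid spacing (about 1% per step with the defaults). A nonlinear least-squares routine would normally be preferred for real data because it also yields parameter uncertainties.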




[Figure: panel a, schematic of full-length GP (residues 1-676) with labels for the furin cleavage site, the putative cathepsin cleavage site (residue 190), disulfide bonds, GP1 and GP2, and of the GPΔ construct (residues 32-312 and 464-632, followed by foldon and His6).]

Extended Data Figure 2 | Structural organization of EBOV GP and GP2
structure. a, Scheme showing the structural organization of EBOV GP. FL,
fusion loop; NHR and CHR, N- and C-terminal heptad repeats; SP, signal
peptide; TM, transmembrane helix. The GPΔ construct used for structure
determination is made by deleting residues 313-463 of the GP mucin
domain and residues 633-676. Residue 312 is directly linked to 464.
A foldon trimerization peptide and a 6× His tag are added at the C
terminus. b, The GP2 trimer in the prefusion state (current structure).
The trimer is shown as cartoon representation with the monomers
coloured in red, green and blue, respectively. Disulfide bonds are shown as
orange sticks. c, The six-helix bundle of GP2 in the post-fusion state.




Extended Data Figure 3 | The fusion loop. a, The fusion loop that
connects β19 and β20 of GP2 projects onto a shallow depression on the
surface of a neighbouring monomer. The fusion loop is shown as a red coil
with side chains drawn as grey sticks, the neighbouring monomer is shown
in semi-transparent surface representation. b, Comparison of the fusion
loop in the apo GP (red and grey) obtained at pH 5.2 with that in the KZ52
Fab complex (cyan) obtained at pH 8.3.




Extended Data Figure 4 | Pockets and tunnels in EBOV GP trimer.
a, The several small pockets and three large tunnels in the GP trimer
shown as grey surfaces. Protein backbones are drawn as ribbons and
coloured as in Fig. 2 of the main text. A toremifene is bound at the
entrance of each large tunnel and shown as yellow sticks. b, Close-up view
of a tunnel. Each tunnel is bordered by secondary structure elements from
two neighbouring monomers.




Extended Data Figure 5 | The inhibitor-binding site. a, The DFF lid
(residues 192-194, blue coil for main chain and sticks for side chains)
nestles at the entrance of the large tunnel in the apo structure. The rest
of the protein is shown as an electrostatic surface. The putative cathepsin
cleavage site at residue 190 is indicated by an arrow. b, c, Toremifene
(yellow sticks in b) and ibuprofen (cyan sticks in c) bind at the same site
by expelling the DFF lid. In both panels, the inhibitor-bound structure is
shown in blue (GP1) and red (GP2), the apo GP in grey. d, Comparing
the binding modes of toremifene and ibuprofen. The toremifene-bound
structure is shown in blue and red, the ibuprofen-bound structure in grey.




Extended Data Figure 6 | The environment of α1′ and α1 helices. The surfaces of α1′ and α1 helices, which undergo large conformational changes
upon receptor binding, are protected by the 287-293 loop from the glycan cap domain and the N563 glycan from GP2 in the apo GP. The glycan is
modelled as Man9GlcNAc2. The receptor-binding site is labelled in the figure.




Extended Data Figure 7 | Chemical structures and electron density maps. a, b, The chemical structures of toremifene (a) and ibuprofen (b).
c, d, |Fo − Fc| omit electron density maps for toremifene (c) and ibuprofen (d) contoured at 3σ.




[Figure: sequence alignment of the GPs of Zaire, Bundibugyo, Tai Forest, Sudan, Reston, Cuevavirus and Marburg viruses around the inhibitor-binding site; the aligned sequences were not recoverable from the extracted text.]

Extended Data Figure 8 | Sequence alignment of filovirus GPs. Amino acid sequence alignment of 7 filovirus GPs around the inhibitor-binding site.
The amino acids that form contacts with toremifene or ibuprofen are coloured in green. Numbering corresponds to the full-length Zaire EBOV GP,
conserved residues are shown in a red background. Secondary structure elements are labelled on the top.




[Figure: panel a, images for EBOVpp + DMSO, NoEnv + DMSO, EBOVpp + 15 µM toremifene, EBOVpp + 1.5 µM toremifene, EBOVpp + 150 µM ibuprofen and EBOVpp + 15 µM ibuprofen; panel b, bar chart of fusogenic cells for the same conditions.]

Extended Data Figure 9 | Toremifene and ibuprofen inhibit fusion
of Ebola GP pseudovirus particles. a, CCF2-loaded TZM-bl cells
were exposed to EBOV pseudoparticles (EBOVpp) or control particles
lacking envelope proteins (NoENV) at 4 °C to synchronise binding and
receptor engagement before fusion was initiated by shifting cells to 37 °C
in the presence of toremifene (15 µM and 1.5 µM), ibuprofen (150 µM
and 15 µM), or just the solvent (5% DMSO). After 2 h incubation,
cells were loaded with the CCF2-AM FRET biosensor, fixed and the
ratio of blue (440-480 nm, cleaved CCF2-AM) to green (500-540 nm,
uncleaved CCF2-AM) fluorescence measured. Cells are pseudocoloured
according to this ratio: blue represents no fusion, red represents fusion.
Scale bar: 80 µm. b, The percentage of fusogenic cells (red versus blue)
was calculated taking the average max value coming from the negative
control as a threshold for fusion, data are means ± s.d. (n = 10). *P < 0.05,
***P < 0.001 (unpaired t-test, compared to the EBOV plus DMSO
control). ns, not significant (P > 0.05). Error bars represent s.d.




Extended Data Table 1 | Data collection and refinement statistics

                        Native GP                 GP-toremifene            GP-ibuprofen
Data collection
Space group             R32                       R32                      R32
Cell dimensions
  a, b, c (Å)           114.3, 114.3, 307.4       113.5, 113.5, 306.9      113.8, 113.8, 306.2
  α, β, γ (°)           90, 90, 120               90, 90, 120              90, 90, 120
Resolution (Å)          94.2-2.23 (2.29-2.23)*    51.2-2.69 (2.76-2.69)    82.8-2.68 (2.75-2.68)
Rmerge                  0.204 (---)               0.079 (---)              0.143 (---)
I/σI                    17.4 (1.3)                20.0 (1.9)               14.7 (1.5)
Completeness (%)        100 (100)                 99.9 (100)               99.9 (100)
Redundancy              57.8 (15.4)               9.8 (8.6)                9.8 (8.3)

Refinement
Resolution (Å)          94.2-2.23                 51.2-2.68                82.8-2.68
No. reflections         36035/1865                20449/1090               20734/1107
Rwork / Rfree           0.226/0.241               0.201/0.245              0.199/0.235
No. atoms
  Protein               3129
  Ligand/glycan/ion     143
  Water
B-factors
  Protein
  Ligand/glycan/ion
  Water
R.m.s. deviations
  Bond lengths (Å)
  Bond angles (°)

*Values in parentheses are for highest-resolution shell.


doi:10.1038/nature18317 


A core viral protein binds host nucleosomes to 
sequester immune danger signals 


Daphne C. Avgousti, Christin Herrmann, Katarzyna Kulej, Neha J. Pancholi, Nikolina Sekulic†, Joana Petrescu,
Rosalynn C. Molden, Daniel Blumenthal, Andrew J. Paris, Emigdio D. Reyes, Philomena Ostapchuk, Patrick Hearing,
Steven H. Seeholzer, G. Scott Worthen, Ben E. Black, Benjamin A. Garcia & Matthew D. Weitzman


Viral proteins mimic host protein structure and function to 
redirect cellular processes and subvert innate defenses¹. Small
basic proteins compact and regulate both viral and cellular DNA
genomes. Nucleosomes are the repeating units of cellular chromatin
and play an important part in innate immune responses². Viral-encoded
core basic proteins compact viral genomes, but their impact
on host chromatin structure and function remains unexplored.
Adenoviruses encode a highly basic protein called protein VII that
resembles cellular histones³. Although protein VII binds viral DNA
and is incorporated with viral genomes into virus particles⁴,⁵, it is
unknown whether protein VII affects cellular chromatin. Here 
we show that protein VII alters cellular chromatin, leading us to 
hypothesize that this has an impact on antiviral responses during 
adenovirus infection in human cells. We find that protein VII 
forms complexes with nucleosomes and limits DNA accessibility. 
We identified post-translational modifications on protein VII that 
are responsible for chromatin localization. Furthermore, proteomic 
analysis demonstrated that protein VII is sufficient to alter the 
protein composition of host chromatin. We found that protein 
VII is necessary and sufficient for retention in the chromatin of 
members of the high-mobility-group protein B family (HMGB1, 
HMGB2 and HMGB3). HMGB1 is actively released in response to 
inflammatory stimuli and functions as a danger signal to activate 
immune responses⁶,⁷. We showed that protein VII can directly
bind HMGB1 in vitro and further demonstrated that protein VII 
expression in mouse lungs is sufficient to decrease inflammation- 
induced HMGB1 content and neutrophil recruitment in the 
bronchoalveolar lavage fluid. Together, our in vitro and in vivo 
results show that protein VII sequesters HMGB1 and can prevent 
its release. This study uncovers a viral strategy in which nucleosome 
binding is exploited to control extracellular immune signalling. 

As viruses commandeer cellular functions to promote viral produc- 
tion, they induce numerous cellular changes. Manipulation of host 
chromatin is important for viral takeover of cellular functions}*"". 
Although there are known examples of viral control by manipulating 
gene expression””””, an alternative strategy for immune evasion could 
exploit cellular chromatin to affect extracellular signalling. Genomes 
of DNA viruses are compacted and packaged into virus particles with 
small basic proteins encoded by the host or virus. Adenoviruses encode 
protein VII, a small basic protein packaged with viral genomes**. We 
hypothesized that protein VII contributes to host chromatin mani- 
pulation. We investigated protein VII localization during infection, 


and found it present in both viral replication centres stained for viral
DNA-binding protein (DBP; Fig. 1a and Extended Data Fig. 1a), and
in cellular chromatin stained for histone H1 and 4’,6-diamidino-
2-phenylindole (DAPI; Fig. 1b). These observations suggest that
protein VII functions on both viral and host genomes. To determine
the impact of protein VII on cellular chromatin, we generated cell lines
with inducible expression. In multiple cell types we observed that protein
VII accumulation altered nuclear DNA into a punctate appearance
(Fig. 1c and Extended Data Fig. 1b, c). We tested whether other basic
proteins produce similar effects on chromatin. Viral core protein V
and the precursor of protein VII (preVII) localized to nucleoli and did
not affect chromatin appearance (Extended Data Fig. 1d). Human protamine
PRM1, a basic protein involved in sperm DNA compaction¹³,
also localized to nucleoli and did not affect chromatin appearance
(Extended Data Fig. 1d). Taken together, our data demonstrate that
protein VII is sufficient to alter cellular chromatin and is distinct from
other small basic proteins.

To affect cellular chromatin at the nucleosome level during infection,
we reasoned that protein VII must be abundant and associated with
histones. Acid extraction of histones¹⁴,¹⁵ from infected cells revealed
viral proteins VII and V isolated with cellular histones (Fig. 1d), as
verified by western blot (Extended Data Fig. 2a) and mass spectrometry
(MS). Protein VII abundance was comparable to cellular histone levels
(Fig. 1d). We further analysed association of protein VII with cellular
chromatin by salt fractionation of nuclei¹⁶. We found protein VII with
cellular histones and DNA in high-salt fractions (Fig. 1e and Extended
Data Fig. 2b-d). Ectopically expressed protein VII is also found in high-salt
fractions, in contrast to other viral proteins that elute at low salt
(Fig. 1e and Extended Data Fig. 2b). These data suggest that protein
VII is highly abundant and tightly associated with cellular chromatin.

We hypothesized that protein VII interacts with chromatin by form- 
ing complexes with DNA, histones or nucleosomes, and examined 
protein VII interactions in vitro. Purified recombinant protein VII 
binds to DNA® (Extended Data Fig. 2e,f). We reconstituted nucle- 
osomes in vitro with recombinant histone proteins on 195 base pairs 
(bp) of DNA¹⁷. Protein VII changed nucleosome mobility upon native
gel electrophoresis (Fig. 1f and Extended Data Fig. 2g). We analysed 
native gel bands by denaturing SDS-polyacrylamide gel electrophoresis
(SDS-PAGE), and confirmed that complexes contained core
histones with protein VII (Fig. 1f, bottom). Unlike protamines¹³, protein
VII forms complexes with nucleosomes but does not appear to
replace histones. Next, we examined whether protein VII association 


¹Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA. ²Division of Cancer Pathobiology,
Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA. ³Cell and Molecular Biology Graduate Group, University of
Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA. ⁴Department of Biochemistry and Biophysics, University of Pennsylvania Perelman School of Medicine,
Philadelphia, Pennsylvania 19104, USA. ⁵Epigenetics Program, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA. ⁶Villanova University, Villanova,
Pennsylvania 19085, USA. ⁷Division of Cell Pathology, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA. ⁸Division of
Pulmonary, Allergy, and Critical Care Medicine, Hospital of the University of Pennsylvania, and the Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia,
Pennsylvania 19104, USA. ⁹Department of Molecular Genetics and Microbiology, School of Medicine, Stony Brook University, Stony Brook, New York 11794, USA. ¹⁰Protein and Proteomics Core,
Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA. ¹¹Division of Neonatology, Children’s Hospital of Philadelphia, Philadelphia, and Department of Pediatrics, University of
Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA. †Present address: Biotechnology Centre of Oslo and Department of Chemistry, University of Oslo, Oslo 0316,
Norway.




[Figure 1 image. Labels recovered from the figure: Mock, Ad5; DAPI, VII-HA, Merge; histone extract; increasing NaCl, H3; 195 bp Nucs, 195 bp Nucs + VII-His; native gel and denaturing gel (Coomassie), kDa; MNase 10 min, digestion beyond core nucleosome; x axis, DNA length (bp).]


Figure 1 | Protein VII is sufficient to alter chromatin and directly binds
nucleosomes. a, b, Adenovirus serotype 5 (Ad5)-infected small airway
epithelial cells (SAECs) stained for protein VII (red) with DBP (a) or histone
H1 (b) (green), and DAPI (grey, blue in merge). hpi, hours post-infection.
c, Protein-VII-haemagglutinin (HA)-induced cells over 4 days showing HA
(green) and DAPI (grey, blue in merge). dox, doxycycline. a-c, Scale bars,
10 µm. d, SDS-PAGE of histone extract from Ad5-infected cells showing
protein V and protein VII. e, Western blot of chromatin fractionation from
nuclei of Ad5-infected cells, induced for protein-VII-HA, or untreated.
f, Protein VII binds to nucleosomes (Nucs). Protein bands from native gel
stained with Coomassie (top) were subjected to two-dimensional analysis by
SDS-PAGE (bottom). g, Protein VII protects nucleosome complexes from
MNase digestion. Bioanalyzer curves represent nucleosomes alone (black) or
protein-VII-nucleosome complexes (orange).


with nucleosomes affects DNA wrapping using micrococcal nuclease
(MNase) digestion followed by DNA fragment analysis¹⁷. We found
that protein VII pauses nucleosomal DNA digestion at ~165 bp, the
point at which DNA strands cross over the nucleosome dyad (Fig. 1g
and Extended Data Fig. 3a). By contrast, nucleosome digestion alone
paused with core particles at ~150 bp, suggesting that protein VII
encumbers DNA access. Unlike linker histone binding that is dependent
on DNA length¹⁸, protein VII protects against MNase digestion on
the nucleosome core particle of 147 bp (Extended Data Fig. 3b). Protein
VII alone protects DNA from MNase digestion, as would be expected
given its role in the viral core. Together, these data demonstrate that
protein VII binds directly to nucleosomes and limits DNA accessibility
at the DNA entry/exit site.

Post-translational modifications (PTMs) on histones are central 
to regulating chromatin structure¹⁴,¹⁹. Owing to the histone-like
nature of protein VII (ref. 3), we hypothesized that it is subject to 




[Figure 2 image. Labels recovered from the figure: a, RP-HPLC of histone extract from mock-infected cells and from Ad5-infected cells (x axis, time (min); y axis, arbitrary units), with H2B marked; c, DAPI, HA, Merge.]


Figure 2 | Post-translational modifications on protein VII contribute
to chromatin localization. a, RP-HPLC analysis of histone extracts. Viral
proteins V, VII and preVII are indicated at 24 hpi. b, Primary sequence of
protein VII with modified residues identified in infected cells. Underlined
residues represent moieties that may also be modified in identified
peptides (see Extended Data Fig. 5). ac, acetylated; P, phosphorylated.
c, Immunofluorescence showing DAPI (grey, blue in merge) and protein VII
(red) as wild type or with alanine substitutions at PTM sites (ΔPTM), K3A
or K3Q. Scale bar, 10 µm.


post-translational modification similar to histones. PreVII was previously
proposed to be acetylated by N-terminal addition during protein
synthesis²⁰. We noted that protein VII contains conserved lysine
residues within an AKKRS motif²¹ similar to the commonly modified
canonical histone motif ARSK¹⁹. We therefore purified protein
VII from histone extracts over an adenovirus infection time course by 
reverse-phase high-performance liquid chromatography (RP-HPLC; 
Fig. 2a and Extended Data Fig. 4). Consistent with observations from 
histone extracts (Fig. 1d), protein VII levels were comparable to endog- 
enous histones. We digested purified protein VII and pre VII with chy- 
motrypsin to distinguish the two proteins, and analysed peptides by 
tandem mass spectrometry (MS/MS). We identified several PT'Ms, with 
two acetylation sites and three phosphorylation sites the most abundant 
modifications (Fig. 2b and Extended Data Figs 5, 6b). Interestingly, we 
identified acetylation sites on ectopically expressed protein VII but not 
on protein VII in virus particles (Extended Data Fig. 6a). We specu- 
late that this provides a possible mechanism for distinguishing protein 
VII bound to cellular chromatin from protein destined for packaged 
virus. To investigate the relevance of the identified PTMs, we mutated 
modified sites in protein VII. An alanine-replacement mutant for 
all five PTM sites localized to nucleoli instead of cellular chromatin 
(Fig. 2c). Results with individual point mutations suggest that the 
K3 residue is important for chromatin localization, and employing 
glutamine as an acetylation mimic (K3Q) mirrored the pattern of 
wild-type protein (Fig. 2c). Effects induced by protein VII are not due 
to global alteration of histone PTMs since only six PTMs on histones 
H3 and H4 showed minor but significant changes (Extended Data 
Fig. 6c-e and Supplementary Table 1). These data suggest that protein 
VII modification has critical roles during virus infection. 

To determine whether protein VII manipulation of cellular chroma- 
tin is part of a strategy to counteract host defences, we employed MS to 
examine changes in the protein composition of nuclear fractions. We 
compared the total chromatin proteome in the presence and absence 
of protein VII (Fig. 3a and Supplementary Table 2). We identified 
20 proteins that changed significantly across three biological replicates 
(Extended Data Fig. 7 and Supplementary Table 2). The categories of 
proteins most significantly changed upon protein VII expression were 




[Figure 3 image. Labels recovered from the figure: a, volcano plot, x axis log₂ ratio (VII-HA induced/uninduced), with SET, HMGB1, HMGB2 and HMGB3 marked; b, k, salt fractionation blots (increasing NaCl) probed for H3, tubulin, DBP, VII, SET and HMGB proteins; c, His pull-down (50% input; VII-His, HMGB1-GST, GST; Coomassie stain) and HA immunoprecipitation (input, HA-IP, with or without dox); d-g, images for no dox and 4 d + dox (A549 cells), mock and Ad5 18 hpi, mock and rAd VII-GFP (SAECs); h, FRAP of HMGB1-mGFP in A549 cells with inducible protein-VII-HA, bleached diameter 5 µm, x axis time (s): no dox, D = 4.97 µm² s⁻¹, t1/2 = 1.4 s; 4 d + dox, D = 0.082 µm² s⁻¹, t1/2 = 11.93 s; i, Ad5-flox-VII with loxP sites, VII-deleted; j, 293 and 293-Cre cells, mock (M) and 4, 10, 18, 24 and 30 hpi, probed for VII, DBP and tubulin.]


Figure 3 | Protein VII directly binds HMGB1 and is necessary for
retention of the alarmin in cellular chromatin. a, Volcano plot for
proteomics analysis of one representative biological replicate of the high-salt
fraction. The y axis represents -log₂ statistical P value and the
x axis represents log₂ protein fold-change between uninduced or
protein-VII-expressing cells (homoscedastic two-tailed t-test, P < 0.05 red dots;
n = 3 technical replicates). b, Nuclear fractionation shows that HMGB1
and HMGB2 normally elute from nuclei at low salt concentrations but
are retained in high-salt fractions by protein-VII-HA. d, day; dox,
doxycycline. c, Protein VII interacts with HMGB1 in pull-down of
recombinant HMGB1-glutathione S-transferase (GST) (left, Coomassie-stained
SDS-PAGE) and immunoprecipitation of HMGB1 (right, western


related to immune responses (Extended Data Fig. 7c). The top four
proteins enriched in chromatin fractions by protein VII were SET (also
known as TAF-1), a protein previously shown to interact with protein
VII²²,²³, and HMGB1, HMGB2 and HMGB3 (Fig. 3a). The HMGB
proteins are alarmins with multiple functions as activators of immunity
and inflammation⁶,⁷. HMGB1 is a nuclear protein normally only
transiently associated with chromatin²⁴,²⁵. Cells also release HMGB1
as an extracellular danger signal that promotes immune responses
after injury or infection²⁶. We confirmed increased chromatin association
of HMGB1 and HMGB2 by analysis of fractionated nuclei,
upon protein VII expression and during adenovirus infection (Fig.
3b). We verified that these changes are not due to altered HMGB1
expression levels (Extended Data Fig. 8a, b). We demonstrated direct
binding of recombinant protein VII to HMGB1 in vitro and confirmed

blots). d, e, Protein VII expression alters localization of HMGB1 (d)
and HMGB2 (e). Immunofluorescence shows protein-VII-HA (green)
colocalized with HMGB1 (d) and HMGB2 (e) (red) in cellular chromatin
(DAPI, grey, blue in merge). f, Same as d at 18 hpi with Ad5 DBP (green).
g, Protein-VII-GFP relocalizes HMGB1 (red) to chromatin with DAPI
(grey, blue in merge). rAd, recombinant adenovirus. d-g, Scale bars,
10 µm. h, FRAP experiment with HMGB1-monomeric GFP (mGFP).
Recovery of FRAP signal in time-course images (left) with quantification
and diffusion coefficients (right). Scale bar, 5 µm. D, diffusion coefficient;
t1/2, halftime of recovery. i, Schematic showing loxP strategy for deleting
protein VII. j, Western blots comparing 293 and 293-Cre cells infected
with Ad5-flox-VII virus. k, Salt fractionation in nuclei from j.


HMGB1 co-immunoprecipitation with protein VII (Fig. 3c). We visually
observed reorganization of HMGB1 and HMGB2 distribution upon 
protein VII expression, and at late stages of infection (Fig. 3d-f and 
Extended Data Fig. 8c-e). We also showed reorganization of HMGB1 
distribution by vector transduction to express protein-VII-green 
fluorescent protein (GFP; Fig. 3g and Extended Data Fig. 8f). The 
effect of protein VII on HMGB1 is also conserved across human 
adenovirus serotypes (Extended Data Fig. 8g). We further defined the 
effects of protein VII on HMGB1 mobility by fluorescence recovery 
after photobleaching (FRAP) and found decreased HMGB1 diffusion 
(Fig. 3h). We next investigated whether protein VII is necessary for 
chromatin retention of HMGB1 during virus infection. We used a 
replication-competent adenovirus with loxP sites inserted on either 
side of the protein VII gene, allowing deletion of protein VII during 




[Figure 4 image. Labels recovered from the figure: a, human lung slice infected with Ad5 (Merge, HMGB1, DBP, DAPI; airway, epithelial cells, infected cell, uninfected cell; scale 100 µm) and human lung slice with rAd VII-GFP (Merge, HMGB1, VII-GFP, DAPI; VII-GFP-expressing cell, no VII-GFP); b, untreated and treated cells probed for HMGB1, HMGB2 and H3; d, schematic: transduce VII-GFP or GFP, 5 days, LPS inhalation, 24 h, BAL fluid, then HMGB1 ELISA and neutrophil count; c, e, f, bar charts with P = 0.02, P = 0.003 and P = 0.005.]


Figure 4 | Protein VII prevents HMGB1 release. a, Precision-cut lung
slices infected with Ad5 or transduced to express protein-VII-GFP.
Endogenous HMGB1 (red) is redistributed in cells with virus (DBP, top)
and protein-VII-GFP (bottom). b, Protein-VII-GFP is sufficient to
inhibit HMGB1 and HMGB2 release in THP-1 cells. Numbers indicate
relative intensities of bands quantified with ImageJ. c, Enzyme-linked
immunosorbent assay (ELISA)-based quantification of HMGB1 in
supernatants from b. Mean ± standard deviation (s.d.), n = 4 technical
replicates, homoscedastic one-tailed t-test. d, Schematic for investigating
protein VII in a mouse lung injury model. e, Expression of protein-VII-GFP
decreases HMGB1 in mouse BAL fluid as quantified by ELISA.
Mean ± s.d., biological replicates: nLPS = 4, nGFP+LPS = 6, nVII-GFP+LPS = 7,
homoscedastic one-tailed (P = 0.02) or two-tailed (P = 0.003) t-test.
f, Neutrophils in bronchoalveolar lavage (BAL) fluid are significantly
fewer in mice expressing protein-VII-GFP. Mean ± s.d., biological
replicates: nGFP+LPS = 6, nVII-GFP+LPS = 4, nLPS = 5, nGFP = 3, nVII-GFP = 3,
homoscedastic two-tailed t-test.


infection of cells expressing Cre recombinase (Fig. 3i, j and Extended 
Data Fig. 9a, b). We fractionated nuclei from infected cells and found
that HMGB1 and HMGB2 were no longer retained in chromatin when 
protein VII was deleted (Fig. 3k and Extended Data Fig. 9c). Together, 
these data indicate that protein VII is necessary and sufficient to 
promote chromatin association and immobilization of HMGB1. 
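
The FRAP half-time (t1/2) quoted for Fig. 3h can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' analysis: it takes the first post-bleach point as the starting level, averages the last few points as the plateau, and linearly interpolates the time at which the curve reaches the midpoint. The function name and the data points are hypothetical, and the diffusion coefficients in the figure would need a diffusion model that is not attempted here.

```python
def frap_half_time(times, intensities, plateau_points=3):
    """Time at which recovery reaches halfway between start and plateau."""
    if len(times) != len(intensities) or len(times) <= plateau_points:
        raise ValueError("need matching series longer than the plateau window")
    start = intensities[0]  # first post-bleach measurement
    plateau = sum(intensities[-plateau_points:]) / plateau_points
    half = start + 0.5 * (plateau - start)
    for i in range(1, len(times)):
        lo, hi = intensities[i - 1], intensities[i]
        if lo < half <= hi:
            # Linear interpolation between the two bracketing time points.
            frac = (half - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("curve never reaches half recovery")


# Hypothetical normalised recovery curve (time in seconds).
times = [0, 1, 2, 3, 4, 5, 6]
signal = [0.2, 0.4, 0.6, 0.7, 0.8, 0.8, 0.8]
t_half = frap_half_time(times, signal)
```

A longer half-time for the same bleached area indicates slower exchange, which is the direction of change reported when protein VII is induced.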

We hypothesized that protein VII retains HMGB1 in chromatin 
during natural infection to prevent cellular release and abrogate host 
immune responses. We therefore visualized endogenous HMGB1 during
adenovirus infection in precision-cut lung slices²⁷ from human
donors (Fig. 4a). Consistent with cell culture experiments, we demonstrate
that protein VII is sufficient to relocalize endogenous HMGB1.
We then tested whether protein VII prevents HMGB1 release in cell
culture and in vivo models. We expressed GFP or protein-VII-GFP in
macrophage-like THP-1 cells, and confirmed that protein-VII-GFP 




was sufficient to alter chromatin and HMGB1 localization (Extended
Data Fig. 9d). Cells were treated to stimulate inflammasomes, and
HMGB1 content was analysed in supernatants. Protein VII expression
resulted in reduced levels of HMGB1 and HMGB2 in supernatants
(Fig. 4b, c). Subsequently, we employed a murine model of lipopolysaccharide
(LPS)-induced lung injury²⁸ to investigate the impact of protein
VII on HMGB1 release and neutrophil recruitment in vivo (Fig. 4d). We
confirmed that protein VII was expressed in transduced mouse lungs
(Extended Data Fig. 10a-c) and retained mouse HMGB1 (Extended
Data Fig. 9e, f). We exposed mice to inhaled LPS to induce HMGB1
release and neutrophil recruitment to alveoli. Bronchoalveolar lavage
fluid obtained 24 h after LPS exposure showed that mice transduced
to express protein VII had significantly less HMGB1 and fewer neutrophils
than mice expressing GFP (Fig. 4d-f). Together, these data suggest
that protein VII functions in cellular chromatin to retain HMGB1 as a
mechanism to blunt immune responses.
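
The group comparisons in Fig. 4 are described as homoscedastic t-tests on mean ± s.d. data. The Python sketch below shows the equal-variance (pooled) t statistic and its degrees of freedom, using only the standard library. The group values and the function name are hypothetical, not measurements from the paper, and converting the statistic to a one- or two-tailed P value (which needs the t distribution) is left out.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, degrees_of_freedom) for Student's equal-variance t-test."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two values")
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    if pooled_var == 0:
        raise ValueError("no variation within the groups")
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df


# Hypothetical readouts for two groups of animals.
group_vii_gfp = [1.0, 2.0, 3.0]
group_gfp = [4.0, 5.0, 6.0]
t_stat, dof = pooled_t(group_vii_gfp, group_gfp)
```

A one-tailed test halves the two-tailed P value when the difference lies in the predicted direction, which is why the caption distinguishes the two.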

In addition to known roles on packaged viral DNA²⁹,³⁰, we show that
protein VII interacts with cellular chromatin and binds nucleosomes. 
We suggest that protein VII PTMs contribute to chromatin localization, 
and that protein VII affects the chromatin association of host proteins. 
Finally, we show that protein VII in cellular chromatin leads to seques- 
tration of HMGB family members, contributing to abrogated immune 
responses (Extended Data Fig. 10d). Our study reveals that chromatin 
retention of signalling molecules by a viral protein may represent a 
previously unrecognized immune evasion strategy. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 10 March; accepted 11 May 2016. 
Published online 29 June 2016. 


1. Elde, N. C. & Malik, H. S. The evolutionary conundrum of pathogen mimicry.
Nature Rev. Microbiol. 7, 787-797 (2009).

2. Smale, S. T., Tarakhovsky, A. & Natoli, G. Chromatin contributions to the
regulation of innate immunity. Annu. Rev. Immunol. 32, 489-511 (2014).

3. Lischwe, M. A. & Sung, M. T. A histone-like protein from adenovirus chromatin.
Nature 267, 552-554 (1977).

4. Chatterjee, P. K., Vayda, M. E. & Flint, S. J. Adenoviral protein VII packages
intracellular viral DNA throughout the early phase of infection. EMBO J. 5,
1633-1644 (1986).

5. Vayda, M. E., Rogers, A. E. & Flint, S. J. The structure of nucleoprotein cores
released from adenovirions. Nucleic Acids Res. 11, 441-460 (1983).

6. Kang, R. et al. HMGB1 in health and disease. Mol. Aspects Med. 40, 1-116
(2014).

7. Lotze, M. T. & Tracey, K. J. High-mobility group box 1 protein (HMGB1): nuclear
weapon in the immune arsenal. Nature Rev. Immunol. 5, 331-342 (2005).

8. Paschos, K. & Allday, M. J. Epigenetic reprogramming of host genes in viral and
microbial pathogenesis. Trends Microbiol. 18, 439-447 (2010).

9. Marazzi, I. et al. Suppression of the antiviral response by an influenza
histone mimic. Nature 483, 428-433 (2012).

10. Ferrari, R., Berk, A. J. & Kurdistani, S. K. Viral manipulation of the host
epigenome for oncogenic transformation. Nature Rev. Genet. 10, 290-294
(2009).

11. Knipe, D. M. et al. Snapshots: chromatin control of viral infection. Virology 435,
141-156 (2013).

12. Ferrari, R. et al. Adenovirus small E1A employs the lysine acetylases p300/CBP
and tumor suppressor Rb to repress select host genes and promote
productive virus infection. Cell Host Microbe 16, 663-676 (2014).

13. Wykes, S. M. & Krawetz, S. A. The structural organization of sperm chromatin.
J. Biol. Chem. 278, 29471-29477 (2003).

14. Lin, S. & Garcia, B. A. Examining histone posttranslational modification
patterns by high-resolution mass spectrometry. Methods Enzymol. 512,
3-28 (2012).

15. Shechter, D., Dormann, H. L., Allis, C. D. & Hake, S. B. Extraction, purification
and analysis of histones. Nature Protocols 2, 1445-1457 (2007).

16. Teves, S. S. & Henikoff, S. Salt fractionation of nucleosomes for genome-wide
profiling. Methods Mol. Biol. 833, 421-432 (2012).

17. Falk, S. J. et al. Chromosomes. CENP-C reshapes and stabilizes CENP-A
nucleosomes at the centromere. Science 348, 699-703 (2015).

18. White, A. E., Hieb, A. R. & Luger, K. A quantitative investigation of linker histone
interactions with nucleosomes and chromatin. Sci. Rep. 6, 19122 (2016).

19. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693-705
(2007).

20. Fedor, M. J. & Daniell, E. Acetylation of histone-like proteins of adenovirus
type 5. J. Virol. 35, 637-643 (1980).

21. Robinson, C. M. et al. Molecular evolution of human adenoviruses. Sci. Rep. 3,
1812 (2013).

22. Gyurcsik, B., Haruki, H., Takahashi, T., Mihara, H. & Nagata, K. Binding modes
of the precursor of adenovirus major core protein VII to DNA and template
activating factor I: implication for the mechanism of remodeling of the
adenovirus chromatin. Biochemistry 45, 303-313 (2006).

23. Haruki, H., Okuwaki, M., Miyagishi, M., Taira, K. & Nagata, K. Involvement of
template-activating factor I/SET in transcription of adenovirus early genes
as a positive-acting factor. J. Virol. 80, 794-801 (2006).

24. Scaffidi, P., Misteli, T. & Bianchi, M. E. Release of chromatin protein HMGB1 by
necrotic cells triggers inflammation. Nature 418, 191-195 (2002).

25. Sapojnikova, N. et al. Biochemical observation of the rapid mobility of nuclear
HMGB1. Biochim. Biophys. Acta 1729, 57-63 (2005).

26. Lu, B. et al. Novel role of PKR in inflammasome activation and HMGB1 release.
Nature 488, 670-674 (2012).

27. Koziol-White, C. J., Damera, G. & Panettieri, R. A. Jr. Targeting airway smooth
muscle in airways diseases: an old concept with new twists. Expert Rev. Respir.
Med. 5, 767-777 (2011).

28. Ueno, H. et al. Contributions of high mobility group box protein in experimental
and clinical acute lung injury. Am. J. Respir. Crit. Care Med. 170, 1310-1316
(2004).

29. Johnson, J. S. et al. Adenovirus protein VII condenses DNA, represses
transcription, and associates with transcriptional activator E1A. J. Virol. 78,
6459-6468 (2004).

30. Karen, K. A. & Hearing, P. Adenovirus core protein VII protects the viral genome
from a DNA damage response at early times after infection. J. Virol. 85,
4135-4142 (2011).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank members of the Weitzman laboratory for
insightful discussions and input, especially R. Dilley and B. Simpson for
generating reagents. We also thank R. Panettieri and C. Koziol-White for
providing precision-cut lung slices. We are grateful to D. Curiel for sharing
recombinant protein-VII-GFP vectors and L. Gerace for anti-protein-VII
antibodies. We thank the Penn Vector Core for assistance in purifying
recombinant vectors, the Penn CDB Microscopy Core for imaging and FRAP
assistance, and the CHOP Pathology core for immunostaining of mouse
lungs. We thank members of the Black, Garcia and Worthen laboratories for
technical help. We thank C. Bassing, I. Brodsky, J. Henao-Mejia, R. Kohli,
C. Lopez, A. Resnick, S. Shin, K. Spindler and J. Weitzman for advice and critical
reading of the manuscript. D.C.A. was supported in part by T32 CA115299
and F32 GM112414. N.J.P. was supported in part by T32 NS007180. N.S. was
supported in part by funding from the American Cancer Society. Research
was supported by grants from the National Institutes of Health (CA097093 to
M.D.W., AI102577 and CA122677 to P.H., AI118891 and GM110174 to B.A.G.,
and GM082989 to B.E.B.), the Institute for Immunology of the University of
Pennsylvania, and funds from the Children’s Hospital of Philadelphia (M.D.W.).


Author Contributions D.C.A. and M.D.W. conceived the project and designed
experiments; D.C.A., C.H., N.S., J.P., N.J.P. and E.D.R. performed the experiments;
D.C.A., C.H. and J.P. generated constructs and cell lines; K.K., R.C.M., S.H.S. and
B.A.G. performed MS analysis; P.O. and P.H. generated Ad5-flox-VII virus and
provided 293-Cre cell line; D.C.A. and D.B. performed the FRAP experiments;
A.J.P. and G.S.W. conducted all mouse experiments; B.E.B. and B.A.G. designed
experiments and interpreted the data; D.C.A. and M.D.W. interpreted the
data and wrote the manuscript and all authors were involved in editing the
manuscript.


Author Information All proteomic raw files have been deposited in the Chorus 
database under project number 1047 (https://chorusproject.org/). Reprints 
and permissions information is available at www.nature.com/reprints. The 
authors declare no competing financial interests. Readers are welcome to 
comment on the online version of the paper. Correspondence and requests for 
materials should be addressed to M.D.W. (weitzmanm@email.chop.edu). 


Reviewer Information Nature thanks M. Bianchi and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 


00 MONTH 2016 | VOL 000 | NATURE | 5 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 

Cells. Primary SAECs, U2OS, HeLa, 293, THP-1 and A549 cells were obtained 
from the American Type Culture Collection (ATCC) and grown according to 
the provider's instructions. Cell lines were not authenticated or tested for myco- 
plasma. Acceptor cells for generation of inducible cell lines were provided by 
E. Makeyev and used as previously reported. Protein VII, preVII and V were 
cloned from genomic DNA isolated from HeLa cells infected with adenovirus type 5 
and inserted into the inducible plasmid cassette with a C-terminal HA tag using 
restriction enzymes BsrGI and AgeI (primer sequences available upon request).
Positive clones were selected in DH5α cells, sequenced, and transfected into A549,
U2OS or HeLa acceptor cells along with a plasmid expressing the Cre recombinase.
Recombined clones were selected by puromycin resistance (1 µg ml⁻¹) and induced
with doxycycline (0.2 µg ml⁻¹) to express the desired protein. Protein expression
was verified by immunofluorescence and western blot. All figures shown are after 
4 days of induction unless otherwise stated. Protein VII and preVII were also 
verified by HPLC purification and MS analysis. Point mutations were generated by 
gene synthesis from Genewiz. 293-Cre cells were provided by P. Hearing. 

Viruses and infections. Wild-type Ad5, Ad9, Ad12 and recombinant adenovirus
vectors expressing only GFP were propagated in 293 cells as previously described*”.
Recombinant adenovirus vector with protein-VII-GFP replaced in the E1 region
was a gift from D. Curiel³⁴. Infections were carried out as described previously³³
using a multiplicity of infection of 10 for primary cells and cell lines for Ad5 
infections. Ad9 and Ad12 infections were carried out with a multiplicity of infec- 
tion of 50 and 20, respectively. Ad5-flox-VII was generated by P. Hearing and 
also prepared using standard methods in 293 cells. loxP sites were added flanking 
protein VII in the Ad5 genome resulting in protein VII deletion during infection of 
293 cells expressing Cre recombinase. 
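
The multiplicities of infection quoted above become an inoculum size once the cell number per well and the stock titre are known. The following is a minimal sketch of that arithmetic; only the MOI values (10, 50 and 20) come from the text, whereas the function names, the cell count and the stock titre used in the usage note are hypothetical placeholders.

```python
# MOI values stated in the Methods for each adenovirus serotype.
MOI_BY_VIRUS = {"Ad5": 10, "Ad9": 50, "Ad12": 20}


def infectious_units_needed(cells_per_well, moi):
    """Infectious units required so that each cell receives `moi` units on average."""
    if cells_per_well < 0 or moi < 0:
        raise ValueError("cell count and MOI must be non-negative")
    return cells_per_well * moi


def inoculum_volume_ul(cells_per_well, moi, stock_titer_per_ml):
    """Volume of virus stock (in microlitres) that delivers the requested MOI."""
    if stock_titer_per_ml <= 0:
        raise ValueError("stock titre must be positive")
    units = infectious_units_needed(cells_per_well, moi)
    return units / stock_titer_per_ml * 1000.0  # ml -> microlitres
```

As an illustration with assumed numbers, `inoculum_volume_ul(200_000, MOI_BY_VIRUS["Ad5"], 1e10)` gives the volume needed for a well of 200,000 cells from a stock of 10¹⁰ infectious units per ml.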

Antibodies. Primary antibodies were purchased from Covance (HA MMS-101R), 
Abcam (H1 ab4269, H3 ab1791, HMGB1 ab18256, HMGB2 ab67282), Millipore 
(H2A 07-146, prosurfactin-C AB3786), and Santa Cruz (Ku86 sc5280, tubulin 
sc69969). The antibodies to DBP, adenoviral late proteins, terminal protein and 
protein VII were gifts from A. Levine³⁵, J. Wilson³², R. Hay and L. Gerace,
respectively. Secondary antibodies for immunoblotting were obtained from Jackson
ImmunoResearch and secondary antibodies for immunofluorescence were 
obtained from Life Technologies. 

Immunofluorescence. Cells were grown on glass coverslips in 24-well plates and 
either infected or induced with doxycycline (0.2 µg ml⁻¹). Cells were harvested
for immunofluorescence at the indicated time points, washed in PBS, fixed in
4% paraformaldehyde for 15 min and post-fixed with 100% ice-cold methanol
for 5 min. Coverslips were then blocked and stained as previously described³⁶
and mounted using ProLong Gold Antifade Reagent (Life Technologies). 
Immunofluorescence was visualized using a Zeiss LSM 710 Confocal microscope 
(Cell and Developmental Microscopy Core at UPenn) and ZEN 2011 software. 
Images were processed using ImageJ and assembled with Adobe CS6. 

Immunoblotting. Western blot analysis was carried out using standard methods.
Briefly, equal amounts of total protein lysates were separated by SDS-PAGE and 
transferred to a nitrocellulose membrane (Millipore) for at least 30 min at 30 V. 
Membranes were stained with ponceau to confirm protein loading and blocked 
in 5% milk in TBST containing 0.1% azide. Membranes were incubated with pri- 
mary antibodies overnight, washed for 30 min in TBST and incubated with sec- 
ondary antibodies conjugated to horseradish peroxidase (Jackson Laboratories) 
for 1h. Membranes were washed again and proteins were visualized with Pierce 
ECL Western Blotting Substrate (Thermo Scientific) and detected using a Syngene 
G-Box. 

Mice. All mice were housed in specific-pathogen-free (SPF) conditions in an ani- 
mal facility at the Children’s Hospital of Philadelphia. All studies in mice were 
carried out in accordance with the recommendations in the Guide for the Care and 
Use of Laboratory Animals of the National Institutes of Health and approved by the 
Institutional Animal Care and Use Committee, Children’s Hospital of Philadelphia 
Animal Welfare Assurance Number A3442-01. C57BL/6J male mice aged 8-10 
weeks were used for experiments. Mice were sedated with ketamine and xylazine. 
Once sedated, mice underwent orotracheal intubation, as previously described³⁷,
with a 20G angiocatheter from BD. Mice subsequently received 5 x 10'° genome
copies (GC) of recombinant adenovirus expressing protein-VII-GFP or GFP
purified by the Penn Vector Core. Four days after infection, mice were exposed
to aerosolized LPS, 3 mg ml⁻¹ for 30 min as previously described³⁸. One day after
LPS exposure, bronchoalveolar lavage (BAL) and lung tissue were harvested
as previously detailed³⁹ and examined for HMGB1 content (ELISA, Chondrex
6010) and neutrophil count (haematoxylin and eosin stain kit EMD 65044/93). 
Immunostaining was carried out by the CHOP Pathology Core using standard 
methods. A minimum of four biological replicates were used for each condition 
studied. Mice were assigned a random number and colour at the start of the
experiment and were randomized. Technicians carrying out the experiments were
blinded to the identity of the samples. Tissue samples were assigned a random study 
number such that the technician performing the analysis was blinded. Unblinding 
for the purpose of data analysis occurred only after all data had been collected. 
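
The randomization and blinding scheme described above can be sketched in a few lines. This is an illustrative outline only, not the procedure actually used: the four-digit study numbers, the function names and the equal group sizes are assumptions, and the colour coding mentioned in the text is not modelled.

```python
import random


def randomize_and_blind(animal_ids, groups, seed=None):
    """Shuffle animals into equal-sized groups and give each a random study number.

    Returns the unblinding key {study_number: (animal_id, group)}, which should
    be kept away from the technicians until all data have been collected.
    """
    if not groups or len(animal_ids) % len(groups) != 0:
        raise ValueError("animals must divide evenly among at least one group")
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(groups)
    # Hypothetical four-digit study numbers, drawn without replacement.
    study_numbers = rng.sample(range(1000, 10000), len(shuffled))
    return {
        number: (animal, groups[index // per_group])
        for index, (number, animal) in enumerate(zip(study_numbers, shuffled))
    }


def blinded_labels(key):
    """What the technician sees: study numbers only, in sorted order."""
    return sorted(key)
```

With eight animals and the two vectors named in the text, `randomize_and_blind(ids, ["protein-VII-GFP", "GFP"])` yields four animals per condition, matching the stated minimum of four biological replicates.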

Salt fractionation of nuclei. Salt fractionation of nuclei was adapted from established
protocols!*“”. Briefly, 2-4 × 10⁷ cells were collected and resuspended in
2 ml of ice-cold buffer I (0.32 M sucrose, 60 mM KCl, 15 mM NaCl, 5 mM MgCl₂,
0.1 mM EGTA, 15 mM Tris, pH 7.5, 0.5 mM dithiothreitol (DTT), 0.1 mM PMSF
and protease inhibitor cocktail from Roche). To dissolve the plasma membrane,
2 ml ice-cold buffer I supplemented with 0.1% IGEPAL were added and samples
were incubated on ice for 10 min. The 4 ml of nuclei was layered on 8 ml of ice-cold
buffer II (1.2 M sucrose, 60 mM KCl, 15 mM NaCl, 5 mM MgCl₂, 0.1 mM EGTA,
15 mM Tris, pH 7.5, 0.5 mM DTT, 0.1 mM PMSF and protease inhibitor cocktail
from Roche) and centrifuged for 20 min at 10,000g and 4 °C. The pelleted nuclei
were resuspended in 400 µl buffer III (10 mM Tris pH 7.4, 2 mM MgCl₂, 0.1 mM
PMSF) supplemented with 5 mM CaCl₂ and the DNA was digested to mononucleosomes
by addition of 1 unit of MNase (Sigma-Aldrich, N3755). The reaction was
incubated at 37 °C for 30 min and then stopped by addition of 25 µl of 0.1 M EGTA.
The samples were centrifuged for 10 min, 350g, at 4 °C, and supernatants were set
aside for western blot analysis. The pellet was resuspended in 400 µl of buffer IV
(70 mM NaCl, 10 mM Tris pH 7.4, 2 mM MgCl₂, 2 mM EGTA, 0.1% Triton X-100,
0.1 mM PMSF) with 80 mM salt and rotated for 30 min at 4 °C. The sample was
centrifuged for 10 min at 350g, 4 °C, and the supernatant collected for western blot
analysis. This step was repeated for salt concentrations in buffer IV of 150 mM,
300 mM and 600 mM. The final pellet was resuspended in 400 µl H₂O and all samples
were analysed together by western blot. An aliquot of each supernatant was set
aside for DNA purification using a PCR purification kit (Qiagen) and analysed by
agarose gel electrophoresis. Alternatively, 4 × 10⁷ cells were resuspended in 400 µl
hypotonic buffer (10 mM HEPES pH 7.9, 1.5 mM MgCl₂, 10 mM KCl, 1:1,000
PMSF, 0.5 mM DTT) and incubated on ice for 30 min. The cells were transferred
to a 1 ml dounce tissue grinder and the cell membranes were gently disrupted
with 40 strokes of a tight-fitting pestle. The samples were centrifuged for 5 min at
1,500g and 4 °C. The pelleted nuclei were resuspended in 400 µl buffer III and the
fractionation was continued as described earlier.
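
The MNase digestion is stopped with EGTA because the enzyme depends on calcium. A short worked calculation shows that the stated volumes leave EGTA in slight molar excess over the CaCl₂ in buffer III. This is a sketch under one simplifying assumption: the small volume of MNase added is ignored, so the sample is taken to be exactly 400 µl.

```python
def mix_concentration_mM(stock_mM, added_ul, sample_ul):
    """Final concentration of a solute after adding `added_ul` of stock to `sample_ul`."""
    return stock_mM * added_ul / (added_ul + sample_ul)


SAMPLE_UL = 400.0  # nuclei resuspended in buffer III (MNase volume ignored)
EGTA_UL = 25.0     # volume of 0.1 M (100 mM) EGTA used to stop the reaction

egta_mM = mix_concentration_mM(100.0, EGTA_UL, SAMPLE_UL)
# CaCl2 is already in the sample at 5 mM, so it is only diluted by the added volume.
calcium_mM = 5.0 * SAMPLE_UL / (SAMPLE_UL + EGTA_UL)

print(f"EGTA {egta_mM:.2f} mM vs CaCl2 {calcium_mM:.2f} mM")
# prints "EGTA 5.88 mM vs CaCl2 4.71 mM"
```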

Preparation of salt fractions for MS analysis. All chemicals used for prepara- 
tion of MS samples were of at least sequencing grade and purchased from Sigma- 
Aldrich, unless otherwise stated. Only the 600 mM salt fraction was used for 
LC-MS/MS analysis. The 0.1% Triton X-100 detergent was removed from samples
before MS analysis by chloroform (CHCl₃)-methanol (MeOH) precipitation⁴¹.
The protein pellet from CHCl₃-MeOH precipitation was
resuspended in 6 M urea and 2 M thiourea in 50 mM ammonium bicarbonate.
Samples were reduced with 10 mM DTT for 1 h at room temperature and then
carbamidomethylated with 20 mM iodoacetamide for 30 min at room temperature
in the dark. Afterwards, alkylated proteins were digested first with endopeptidase
Lys-C (Wako, MS grade) for 3 h, after which the solution was diluted 10 times with
20 mM ammonium bicarbonate. Subsequently, samples were digested with trypsin
(Promega) at an enzyme-to-substrate ratio of approximately 1:50 for 12 h at room
temperature. The samples were acidified with 5% formic acid (FA) to a pH <3 and
desalted using Poros Oligo R3 RP columns (PerSeptive Biosystems) packed in a
P200 stage tip with C₁₈ 3M plug (3M Bioanalytical Technologies). Purified peptide
samples were dried by lyophilization and stored at −20 °C until further analysis.
This procedure was carried out for three biological replicates.

Nano-LC-MS/MS and analysis of salt fractions. Samples were loaded onto a 
16 cm C₁₈-AQ column (inner diameter 75 µm, 3 µm beads, Dr. Maisch GmbH,
Germany) using an Easy nano-flow HPLC system (Thermo Fisher Scientific). 
The nano-LC was coupled to an Orbitrap Fusion Tribrid Mass Spectrometer 
(Thermo Fisher Scientific) via a nanoelectrospray ion source (Thermo Fisher 
Scientific). Peptides were loaded in buffer A (0.1% formic acid) and eluted with 
a 120 min linear gradient from 2-30% buffer B (95% acetonitrile, 0.1% formic 
acid). After the gradient, the column was washed with 90% buffer B. Mass spectra 
were acquired using a data-dependent acquisition method with the TopSpeed set 
with 3-s cycle. Spectra were acquired in the Orbitrap analyser with mass range of 
350-1,200 m/z and 120,000 resolution (200 m/z), with a maximum injection time 
of 50 ms and an AGC target of 5 x 10°. Signals with 2-5 charges were selected for 
HCD fragmentation using a normalized collision energy of 27, a maximum injec- 
tion time of 120 ms and an AGC target of 10,000. Fragments were analysed in 
the ion trap. Raw MS files were analysed by MaxQuant (v.1.5.2.8)⁴² (http://www.maxquant.org).
MS/MS spectra were searched against the UniProt-human database
(version June 2014, 59,345 entries). All search parameters were default,
with the exception of including the match between runs (1 min window) and
the intensity-based absolute quantification (iBAQ) label-free quantification⁴³.
The search included variable modifications of methionine oxidation and
N-terminal acetylation, and fixed modification of carbamidomethyl cysteine.
Each iBAQ value was log₂ transformed and subsequently normalized by the average
protein abundance within each run. Biological process association analysis
and process network enrichment were performed using the GeneGo MetaCore 
pathways analysis package with FDR < 5%; each Gene Ontology term was ranked 
using P-value enrichment. 
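
The iBAQ transformation described above can be illustrated as follows. This is a minimal sketch, not the authors' script: it assumes that "normalized by the average protein abundance within each run" means subtracting the run-wide mean of the log₂ values, and that proteins with zero intensity are dropped before the transform. The function name and the input layout (a mapping from protein identifier to raw iBAQ intensity for one run) are hypothetical.

```python
import math


def normalize_ibaq(run):
    """Log2-transform the iBAQ values of one run and centre them on the run mean.

    `run` maps protein identifiers to raw iBAQ intensities. Non-positive
    intensities (proteins not quantified in this run) are skipped.
    """
    logged = {protein: math.log2(value) for protein, value in run.items() if value > 0}
    if not logged:
        raise ValueError("run contains no quantified proteins")
    run_mean = sum(logged.values()) / len(logged)
    # Subtracting in log space is equivalent to dividing by the geometric mean.
    return {protein: value - run_mean for protein, value in logged.items()}
```

Centring each run separately makes values comparable between runs that differ in total loaded material, which is the usual motivation for this kind of normalization.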

Purification of recombinant protein-VII-His. Protein VII was cloned from 
genomic DNA isolated from adenovirus-infected HeLa cells into a pET21a back- 
bone to generate a C-terminal hexahistidine tag. Positive clones were selected in 
DH5α cells, sequenced, and transformed into BL21 (DE3) cells (NEB C2527I).
The purification of insoluble protein-VII-His was adapted from existing protocols
to purify histone proteins from Escherichia coli‘. Briefly, BL21 cells were inoculated
from overnight cultures and grown to an optical density of 0.5-0.6 (OD₆₀₀ nm),
induced with 0.1 mM isopropyl-β-D-thiogalactoside (IPTG; Sigma) and harvested
after 4 h at 37 °C. Cell pellets were resuspended in a mild buffer (50 mM Tris-HCl
pH 8.0, 500 mM NaCl, 1 mM PMSF, 5% glycerol, 2.5 µg ml⁻¹ aprotinin, leupeptin
and pepstatin) and disrupted by sonication using a Branson 250 sonifier. The 
lysate was then centrifuged at 27,000g for 20 min at 4°C. The supernatants were 
discarded and pellets were resuspended in a denaturing buffer (50 mM Tris-HCl, 
pH 8.0, 500 mM NaCl, 5% glycerol, 8 M urea). The suspension was centrifuged 
again to eliminate insoluble cell debris and the His-tagged protein was isolated 
using a cobalt resin (ThermoScientific 89964) according to the manufacturer's 
instructions for denaturing conditions. The purified protein was then dialysed 
against water and lyophilized. Purified protein was verified by western blot 
and MS. 

In vitro binding assays. HMGB1-GST (Abnova) or GST (Sigma) were combined 
with recombinant protein-VII-His at equimolar ratios and incubated at 4°C for 
1h. Complexes were then mixed with a cobalt resin (ThermoScientific 89964) to 
bind protein-VII-His and any associated protein and washed three times in the 
binding buffer (50 mM Tris pH 8, 300 mM NaCl, 0.1% IGEPAL). The beads were 
then boiled in sample buffer, separated on a 4-12% NuPage gel and visualized by 
Coomassie staining. 

Nucleosome in vitro binding and MNase digestion assays. Gel shift and MNase 
digestion assays were carried out as previously described!”4**”. Briefly, nucleosomes
were reconstituted by incubating purified recombinant histones with ‘601’
DNA of either 195 or 147 bp over a series of dialysis steps. Recombinant protein-VII-His
was then combined with nucleosomes at various molar ratios, incubated at room
temperature for 15 min, and analysed by native gel electrophoresis. Complexes
were also digested with MNase (Affymetrix) by addition of 1 unit per µg of DNA
for 147 bp nucleosome experiments and 0.1 unit per µg of DNA for 195 bp nucleosome
experiments, incubated at 22 °C for varying amounts of time followed by
the addition of EGTA and guanidine thiocyanate to stop the reaction. The DNA 
fragments were then purified using a MinElute PCR purification kit (Qiagen) and 
analysed on an Agilent 2100 Bioanalyzer as previously described”. 

Release assay of HMGB1 in THP-1 cells. THP-1 cells were seeded at a density of 
2 x 10° cells per well in a 24-well plate, and stimulated into macrophage-like cells 
by addition of 10 ng ml⁻¹ PMA for 48 h. Cells were washed in PBS and transduced
with recombinant adenovirus vectors expressing only GFP or protein-VII-GFP
such that >90% of cells were GFP positive. At 48 h after transduction, cells were
washed and 200 µl of serum-free RPMI was added. To stimulate the inflammasome,
LPS (Sigma-Aldrich L2880) with a final concentration of 0.5 µg ml⁻¹ was added to
wells and incubated for 2 h, then nigericin (Sigma-Aldrich N7143) was added with
a final concentration of 10 µM for 1 h. Supernatants were collected and proteins
precipitated overnight at 4 °C with a final concentration of 20% trichloroacetic acid
(Sigma), washed with acetone, dried, and resuspended in 1× LDS sample buffer
with reducing agent (Invitrogen). For ELISA analysis, supernatants were harvested
directly and HMGB1 content was detected according to the manufacturer’s instructions
(Chondrex 6010). Cells were also harvested by the addition of 1× LDS sample buffer
with reducing agent (Invitrogen) and boiled. Supernatants and lysates were 
analysed together by western blot. 
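
Reaching a stated final concentration of trichloroacetic acid requires accounting for the volume that the stock itself adds. The sketch below works through that calculation; the 100% stock concentration is an assumption (the text gives only the 20% final concentration), and the small volumes of LPS and nigericin added to the 200 µl of medium are ignored.

```python
def stock_volume_for_final(sample_ul, stock_fraction, final_fraction):
    """Volume of stock to add to `sample_ul` so the mixture reaches `final_fraction`.

    Solves final = stock * v / (sample + v) for v, with concentrations given
    as fractions (1.0 means 100%).
    """
    if not 0 < final_fraction < stock_fraction:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_ul * final_fraction / (stock_fraction - final_fraction)


# 200 µl of supernatant, assumed 100% TCA stock, 20% final concentration.
tca_ul = stock_volume_for_final(200.0, 1.0, 0.20)
print(f"add {tca_ul:.0f} µl of stock")  # prints "add 50 µl of stock"
```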

Acid extraction and RP-HPLC. Histones were prepared for MS analysis as 
detailed previously⁴⁸. Nuclei were isolated and histones from infected cells were
extracted by acid as previously described. The preVII and protein VII variants
were fractionated using an offline RP-HPLC. Briefly, ~100 µg of protein was
resuspended in buffer A (0.1% trifluoroacetic acid (TFA) in HPLC-grade water)
and loaded onto a C₁₈ 5 µm column (4.6 mm internal diameter × 250 mm, Vydac)
using a Beckman Coulter (System Gold) HPLC (buffer A: 0.1% TFA; buffer B:
95% acetonitrile, 0.08% TFA). The proteins were separated using a gradient
from 30 to 45% buffer B in 100 min at a flow rate of 0.2 ml min⁻¹. The fractions
containing the proteins of interest were collected using an automatic fraction
collector and individual peaks were combined based on their ultraviolet signal.
The fractions were subsequently dried by vacuum centrifugation and prepared
for MS (see later). Protein VII was purified from three biological replicates and
analysed as follows for MS.

MS analysis of protein VII PTMs. Sample preparation/protein VII. RP-HPLC- 
purified samples of protein VII variants were reduced in 10mM DTT in 50mM 
ammonium bicarbonate for 1 h at 56°C. After cooling to room temperature, sam- 
ples were alkylated in 20 mM iodoacetamide in 50 mM ammonium bicarbonate 
for 30 min in the dark. Samples were digested with chymotrypsin or Arg-C, at an 
enzyme-to-substrate ratio of approximately 1:20 for 8h at 37°C. The samples were 
acidified to a final concentration of 5% formic acid to a pH <3 and desalted using 
P200 stage tip columns packed with C₁₈ 3M plug (3M Bioanalytical Technologies).
Purified peptide samples were dried by lyophilization and stored at −20 °C until
further analysis. 

Nano-LC-MS/MS analysis of histone PTMs. The nano-LC-MS/MS analysis was 
performed as previously described⁴⁸.

Nano-LC-MS/MS analysis of protein VII peptides. The nano-LC-MS/MS analysis 
was performed in triplicate for each sample. Samples were loaded onto a 16 cm
C₁₈-AQ column (inner diameter 75 µm, 3 µm beads, Dr. Maisch GmbH) using
an Easy nano-flow HPLC system (Thermo Fisher Scientific). The nano-LC was 
coupled to an Orbitrap Velos Pro Mass Spectrometer (Thermo Fisher Scientific) 
via a nanoelectrospray ion source (Thermo Fisher Scientific). Peptides were loaded 
in buffer A (0.1% formic acid) and eluted with a 45 min linear gradient from 2 to 
30% buffer B (95% acetonitrile, 0.1% formic acid). After the gradient, the column 
was washed with 90% buffer B. Mass spectra were acquired using a data-dependent 
acquisition method with the top 15 most intense ions. Spectra were acquired in 
the Orbitrap analyser with mass range of 350-1,600 m/z and 60,000 resolution 
(400 m/z), with a maximum injection time of 10 ms and an AGC target of 10 x 10°. 
Signals above 1,000 count charges were selected for HCD fragmentation using 
normalized collision energy of 36, a maximum injection time of 100 ms and an 
AGC target of 50,000. Fragments were analysed in the orbitrap. 
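
For readers less familiar with gradient elution, the linear gradients described in these sections can be expressed as a simple interpolation. The sketch below is illustrative only: the function name is hypothetical, the defaults follow the 45 min, 2 to 30% buffer B gradient used for protein VII peptides, and the subsequent 90% buffer B wash is not modelled.

```python
def percent_buffer_b(t_min, start=2.0, end=30.0, duration_min=45.0):
    """Percentage of buffer B at time `t_min` during a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient")
    return start + (end - start) * t_min / duration_min


# Midpoint of the 45 min gradient used for protein VII peptides.
print(percent_buffer_b(22.5))                     # prints 16.0
# Midpoint of the 120 min gradient used for the salt fractions.
print(percent_buffer_b(60.0, duration_min=120.0)) # prints 16.0
```

Both gradients span the same range of buffer B, so the longer run simply changes composition more slowly over time.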

Data processing of protein VII spectra. Raw mass spectrometer files were ana- 
lysed using Proteome Discoverer (v.1.4, Thermo Scientific). MS/MS spectra were 
converted to .mgf files and searched against the UniProt adenovirus C serotype 5 
database using Mascot (v.2.5, Matrix Science). Database searching was performed 
with the following parameters: precursor mass tolerance 10 p.p.m.; MS/MS mass 
tolerance 0.05 Da; enzyme chymotrypsin (Promega) or Arg-C (Roche), with two 
missed cleavages allowed; fixed modification was cysteine carbamidomethylation; 
variable modifications were methionine oxidation, serine/threonine/tyrosine phos- 
phorylation, lysine acetylation and methylation, asparagine and glutamine deami- 
dation. Specifically, phosphorylation, acetylation, and methylation were searched 
separately, not as co-existing modifications. Peptides were filtered for <1% FDR, 
Mascot ion score >20 and peptide rank 1. 
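
The acceptance criteria listed above can be written as a single predicate. This is a hedged sketch rather than the Proteome Discoverer or Mascot implementation: the `PeptideMatch` record and its field names are hypothetical, a per-match `q_value` stands in for the <1% FDR threshold, and the 10 p.p.m. precursor tolerance, which the search engine normally enforces itself, is repeated only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class PeptideMatch:
    sequence: str
    observed_mz: float
    theoretical_mz: float
    ion_score: float
    rank: int
    q_value: float  # hypothetical per-match FDR estimate


def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_filters(match, ppm_tolerance=10.0, min_ion_score=20.0, max_fdr=0.01):
    """True if the match meets the thresholds quoted in the text."""
    return (
        abs(ppm_error(match.observed_mz, match.theoretical_mz)) <= ppm_tolerance
        and match.q_value < max_fdr          # <1% FDR
        and match.ion_score > min_ion_score  # Mascot ion score >20
        and match.rank == 1                  # peptide rank 1
    )
```

The score comparison is strict, mirroring the ">20" in the text, so a match scoring exactly 20 is rejected.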

Co-immunoprecipitation of protein-VII-HA. A549 cells were induced 
to express protein VII with doxycycline for 4 days as described earlier. 
Approximately 4 × 10⁷ cells were harvested and pelleted for each immunoprecipitation
reaction. Cell pellets were resuspended in 500 µl of IC wash buffer
with protease inhibitors (20 mM HEPES pH 7.9, 110 mM KOAc, 2 mM MgCl₂,
150 mM NaCl, 0.1% Tween-20, 0.1% Triton X-100) and incubated on ice for
10 min with intermittent vortexing to disrupt cells. Samples were then incubated
on ice for 1 h with 5 µl of benzonase (Millipore) added to each sample to digest
DNA to ~150 bp, which was confirmed by DNA isolation and agarose gel analysis.
Samples were then sonicated in a Diagenode Bioruptor for 30 s on and 30 s off for
five rounds at 4 °C and centrifuged at 14,000g for 15 min at 4 °C. Supernatants
were then incubated rotating for 1 h at 4 °C with 30 µl of HA-conjugated magnetic
beads (Thermo Scientific) and washed three times for 5 min in IC buffer. Isolated
proteins were eluted with 100 µl of 2 mg ml⁻¹ HA peptide (Thermo Scientific)
for 20 min rotating at 37 °C and separated on an SDS-PAGE gel. For protein
separation by SDS-PAGE the NuPAGE 1DE System was used (NuPAGE Novex
4-12% Bis-Tris 1.0 mm gels, Invitrogen). Uninduced cells were used as a negative
control. The immunoprecipitation was carried out in biological triplicate and
pull-down of protein-VII-HA and HMGB1 was confirmed by western blotting
using standard techniques as described earlier.

Quantitative PCR. Genomic DNA was isolated using the PureLink 
Genomic DNA kit (Thermo Scientific). Quantitative PCR was performed 
using primers specific for viral DBP (5'-GCCATTGCGCCCAAGAAGAA
and 5'-CTGTCCACGATTACCTCTGGTGAT), protein VII
(5'-GCGGGTATTGTCACTGTGC and 5'-CACCCAATACACGTTGCCC), and cellular
tubulin (5'-CCAGATGCCAAGTGACAAGAC and
5'-GAGTGAGTGACAAGAGAAGCC). Values for DBP and protein VII were normalized internally
to tubulin and to the 4 h time point to control for any variation in virus input.
RNA was isolated using the RNeasy Mini Kit (Qiagen) and reverse transcribed
using the High Capacity RNA to cDNA Kit (Applied Biosystems).
Quantitative PCR was performed using primers specific for HMGB1
(5'-TAACTAAACATGGGCAAAGGAG and 5'-TAGCAGACATGGTCTTCCAC)
and β-actin (5'-GCACCACACCTTCTACAATGAG and
5'-GGTCTCAAACATGATCTGGGTC). Quantitative PCR was performed using the standard
protocol for SYBR Green (Thermo Scientific) and analysed using the ViiA 7
Real-Time PCR System (Thermo Scientific).
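
The double normalization described for viral DNA (to tubulin, then to the 4 h time point) is commonly computed with the comparative Ct approach. The sketch below assumes that approach and perfect doubling per cycle; neither is stated in the text, and the function name and Ct values are hypothetical.

```python
def relative_quantity(ct_target, ct_reference, ct_target_cal, ct_reference_cal,
                      efficiency=2.0):
    """Fold change of a target relative to a calibrator sample.

    `ct_target` and `ct_reference` come from the sample of interest (for example
    DBP and tubulin at a late time point); the `_cal` values come from the
    calibrator (here, the 4 h time point). `efficiency` is the amplification
    factor per cycle, assumed to be 2.0.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return efficiency ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target crosses threshold 5 cycles earlier than at 4 h.
print(relative_quantity(20.0, 18.0, 25.0, 18.0))  # prints 32.0
```

By construction the calibrator itself returns 1.0, so all later time points are expressed as fold change over the input measured at 4 h.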

Precision-cut lung slice immunofluorescence. Precision-cut lung slices were 
obtained and prepared as previously described⁴⁹. De-identified human lung
tissue from donors was obtained from the National Disease Research Interchange. 
Analysis of human samples was approved by the University of Pennsylvania 
Internal Review Board. Samples were infected with 10° plaque-forming units 
(p.f.u.) of Ad5 per slice or 10? GC of rAd protein-VII-GFP for 24h. Samples were 
fixed in 4% PFA at room temperature for 15 min and washed three times in PBS. 
Samples were permeabilized with 0.5% Triton X-100 and washed twice more in 
PBS. Samples were then incubated with 3% BSA and 0.03% Triton X-100 in PBS 
for 1 h to block. Primary antibodies (DBP or HMGB1) were incubated in the same 
buffer for 1 h and then samples were washed three times in PBS with 3% BSA, 
incubated with secondary antibodies and DAPI for 1h, and washed three more 
times. Whole slices were mounted on slides with mounting solution and imaged 
by confocal microscopy. 

FRAP. Full-length HMGB1 was cloned from pcDNA3.1 Flag-hHMGB1 (Addgene
31609) into pEGFP-N1 containing a L221K mutation to prevent dimerization of
GFP molecules⁵⁰. A549 cells were induced to express protein VII for 4 days with
doxycycline in glass-bottom dishes. Cells were then transfected with the construct
that constitutively expresses HMGB1 with a monomeric GFP C-terminal
tag. FRAP was carried out using standard methods on a Zeiss LSM 710 confocal
microscope. Diffusion coefficients were calculated using the ‘simFRAP’ algorithm
(http://imagej.nih.gov/ij/plugins/sim-frap/index.html), a simulation-based
approach to FRAP analysis⁵¹.

Statistical analyses. Statistical details are reported in each figure legend. Statistical 
analyses were performed on at least three different biological replicates, unless oth- 
erwise stated in the figure legend. The sample size was chosen to provide enough 
statistical power to apply parametric tests (one- or two-tailed homoscedastic t-test). 
The t-test was considered a valuable statistical test since binary comparisons were 
performed and the number of replicates was limited. Furthermore, we applied the 
homoscedastic t-test assuming that the variance between the two data sets would 
remain homogeneous due to the use of the same cell lines in culture with and 
without protein VII expression. No samples were excluded as outliers (this applies 
to all proteomics analyses described in this manuscript). Proteins with a P value 
smaller than 0.05 were considered to be significantly altered between the two tested 
conditions for two-tailed and one-tailed t-test. Data distribution was assumed to 
be normal but this was not formally tested. The nano-LC-MS/MS analysis was 
performed in triplicate for each sample to determine technical variation. 
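
The homoscedastic (pooled-variance) t-test referred to above can be made concrete with a short calculation of the test statistic. This is a sketch using only the Python standard library; it returns the statistic and degrees of freedom and leaves the P value to a statistics package. The example values are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two replicates")
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    if standard_error == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / standard_error, df


t, df = pooled_t_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(round(t, 3), df)  # prints -3.674 4
```

With three replicates per condition there are four degrees of freedom, for which the two-tailed critical value at P = 0.05 is 2.776, so the hypothetical example above would be called significant.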


31. Khandelia, P., Yap, K. & Makeyev, E. V. Streamlined platform for short hairpin
RNA interference and transgenesis in cultured mammalian cells. Proc. Natl
Acad. Sci. USA 108, 12799-12804 (2011).
32. Kozarsky, K. F., Jooss, K., Donahee, M., Strauss, J. F. III & Wilson, J. M. Effective
treatment of familial hypercholesterolaemia in the mouse model using
adenovirus-mediated transfer of the VLDL receptor gene. Nature Genet. 13,
54-62 (1996).
33. Orazio, N. I., Naeger, C. M., Karlseder, J. & Weitzman, M. D. The adenovirus
E1b55K/E4orf6 complex induces degradation of the Bloom helicase
during infection. J. Virol. 85, 1887-1892 (2011).
34. Le, L. P. et al. Core labeling of adenovirus with EGFP. Virology 351, 291-302
(2006).
35. Reich, N. C., Sarnow, P., Duprey, E. & Levine, A. J. Monoclonal antibodies which
recognize native and denatured forms of the adenovirus DNA-binding protein.
Virology 128, 480-484 (1983).
36. Lilley, C. E., Chaurushiya, M. S., Boutell, C., Everett, R. D. & Weitzman, M. D.
The intrinsic antiviral defense to incoming HSV-1 genomes includes specific
DNA repair proteins and is counteracted by the viral protein ICP0. PLoS Pathog.
7, e1002084 (2011).
37. Das, S., MacDonald, K., Chang, H.-Y. S. & Mitzner, W. A simple method of mouse
lung intubation. J. Vis. Exp. 73, e50318 (2013).
38. Jeyaseelan, S., Chu, H. W., Young, S. K. & Worthen, G. S. Transcriptional profiling
of lipopolysaccharide-induced acute lung injury. Infect. Immun. 72, 7247-7256
(2004).
39. Nick, J. A. et al. Role of p38 mitogen-activated protein kinase in a murine
model of pulmonary inflammation. J. Immunol. 164, 2151-2159 (2000).
40. Zaret, K. Micrococcal nuclease analysis of chromatin structure. Curr. Protoc.
Mol. Biol. Chapter 21, Unit 21.1 (2005).
41. Wessel, D. & Flügge, U. I. A method for the quantitative recovery of protein in
dilute solution in the presence of detergents and lipids. Anal. Biochem. 138,
141-143 (1984).
42. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates,
individualized p.p.b.-range mass accuracies and proteome-wide protein
quantification. Nature Biotechnol. 26, 1367-1372 (2008).
43. Schwanhäusser, B. et al. Global quantification of mammalian gene expression
control. Nature 473, 337-342 (2011).
44. Tanaka, Y. et al. Expression and purification of recombinant human histones.
Methods 33, 3-11 (2004).
45. Luger, K., Rechsteiner, T. J., Flaus, A. J., Waye, M. M. & Richmond, T. J.
Characterization of nucleosome core particles containing histone proteins
made in bacteria. J. Mol. Biol. 272, 301-311 (1997).
46. Hasson, D. et al. The octamer is the major form of CENP-A nucleosomes at
human centromeres. Nature Struct. Mol. Biol. 20, 687-695 (2013).
47. Sekulic, N., Bassett, E. A., Rogers, D. J. & Black, B. E. The structure of
(CENP-A-H4)₂ reveals physical features that mark centromeres. Nature 467,
347-351 (2010).
48. Kulej, K., Avgousti, D. C., Weitzman, M. D. & Garcia, B. A. Characterization of
histone post-translational modifications during virus infection using mass
spectrometry-based proteomics. Methods 90, 8-20 (2015).
49. Cooper, P. R. & Panettieri, R. A. Jr. Steroids completely reverse albuterol-induced
β2-adrenergic receptor tolerance in human small airways. J. Allergy
Clin. Immunol. 122, 734-740 (2008).
50. Zacharias, D. A., Violin, J. D., Newton, A. C. & Tsien, R. Y. Partitioning of
lipid-modified monomeric GFPs into membrane microdomains of live cells.
Science 296, 913-916 (2002).
51. Blumenthal, D., Goldstien, L., Edidin, M. & Gheber, L. A. Universal approach to
FRAP analysis of arbitrary bleaching patterns. Sci. Rep. 5, 11655 (2015).




[Extended Data Figure 1 image not reproduced. Panel labels: a, 24 hpi, Ad5 and mock (U2OS); b, protein VII expression levels; c, U2OS and HeLa, no dox and VII-HA; d, A549, V-HA, preVII-HA, PRM1-HA and no dox.]

Extended Data Figure 1 | Adenovirus protein VII distorts chromatin.
a, Protein VII localizes to cellular chromatin and viral replication centres
in U2OS cells similarly to SAECs in Fig. 1a. b, Protein VII messenger RNA
levels measured by quantitative PCR showing that after 4 days of induction
in the A549 cell line, the level of protein VII transcripts is approximately
10% of that measured during infection at 16 hpi. Despite the low relative
level, this amount of protein VII is sufficient to cause dramatic changes
in the nucleus (graph shows mean ± s.d., n = 3 biological replicates).
c, Inducible cell lines of U2OS and HeLa expressing protein-VII-HA
show chromatin localization and distortion, similar to A549 cells in
Fig. 1c. d, Inducible A549 cell lines expressing viral protein V, the
precursor for protein VII (preVII) or cellular protamine PRM1 with
C-terminal HA tags. Although all three proteins possess a large number
of charged residues, none are sufficient to distort cellular chromatin or
increase nuclear size as observed with mature protein VII. Scale bars,
10 µm.




[Extended Data Figure 2 image not reproduced. Recoverable labels: a, input and acid extract, anti-VII western blot with molecular mass markers from 10 to 260 kDa; b, protein VII, DBP, terminal protein and late proteins; d, days post-induction; native gels stained with ethidium bromide (EtBr) or Coomassie, with nucleosomes (Nucs) in the presence or absence of VII-His.]

Extended Data Figure 2 | Protein VII associates tightly with chromatin
and binds DNA and nucleosomes in vitro. a, Western blot analysis
showing protein VII in histone extracts from infected HeLa cells at
24 hpi. b, Chromatin fractionation of lysates from A549 cells that were
uninfected (mock) or infected for 24 h with Ad5. Viral and cellular
proteins were detected by western blotting with various antibodies
as indicated. c, Agarose gel analysis of DNA extracted from nuclear
fractionation experiments, indicating that the size of DNA is between
100 and 200 bp and elutes predominantly in the higher-salt fractions.
d, Chromatin fractionation of cells induced to express protein VII,
indicating that protein VII is present in the highest-salt fraction from
the first day of induction. e, f, Recombinant protein-VII-His binds
DNA. Incubating increasing molar amounts of protein VII with 195 bp
DNA results in shifts by native gel electrophoresis, indicating
protein-VII-DNA complex formation. Staining with either ethidium bromide (e)
or Coomassie (f) is shown to verify the presence of DNA and protein,
respectively. g, Ethidium bromide staining shows DNA content of
nucleosome shifts from gel in Fig. 1f.


Extended Data Figure 3 | Bioanalyzer examination of MNase-digested
nucleosomes and protein-VII-nucleosome complexes. a, 195 bp
nucleosomes or protein-VII-nucleosome complexes were incubated
with MNase for the indicated times, the reaction was stopped, DNA was
extracted and analysed. As in Fig. 1g, nucleosomes are shown in black and
protein-VII-nucleosome complexes in orange. The presence of protein VII
pauses digestion at 165 bp, suggesting that protein VII is blocking access
to the DNA. b, 147 bp nucleosomes or protein-VII-nucleosome complexes
were incubated with MNase for the indicated times, the reaction was
stopped, DNA was extracted and analysed. Graphs show nucleosomes
in grey and protein-VII-nucleosome complexes in orange. The presence
of protein VII completely blocks digestion even after nucleosomes alone
have been digested well beyond the core particle. In contrast to what
would be expected for linker histones, protein VII protects the core
nucleosome particle from digestion. These data indicate that protein VII
may be masking the substrate for MNase through complex formation.
This represents a unique mechanism of nucleosome binding and suggests
a model for blocking DNA access in cellular chromatin during infection.


Extended Data Figure 4 | Purification of protein VII from infected cells.
a, Coomassie-stained SDS-PAGE analysis of fractions from RP-HPLC
in Fig. 2a. The bands in fraction 38-41 min correspond to histone H1.
Protein VII and V, as indicated, were verified by MS analysis (data not
shown). The slight upward shift of the protein VII bands in the later peak
corresponds to the higher abundance of protein preVII, as seen by HPLC
in Fig. 2a. b, Western blot analysis of protein VII in HPLC fractions from a.
c, Time course of infection followed by histone extraction and HPLC
analysis. MS analysis verified peaks in each sample as indicated.


Extended Data Figure 5 | Representative mass spectra. a-f, Annotated
MS/MS spectra of identified peptides of protein VII containing PTMs
(a-c, acetylated peptides; d-f, phosphorylated peptides). The images
represent the observed fragment ions collected using MS/MS
collision-induced dissociation (CID). Coloured lines represent matches between
observed and expected fragment ions of the given peptides. Specifically,
green lines represent the unfragmented precursor mass, blue lines represent
matches with y-type fragments, red lines with b-type fragments, and
yellow boxed masses represent fragments containing PTM neutral losses
(for example, ions that lost the phosphorylation during fragmentation).


Post-translational modifications of protein VII (Extended Data Figure 6, panel b)

Mature protein VII (site numbers for these rows did not survive extraction)
Modification      Modified sequence
Acetylation       AkKRSDQHPVRVRGHY
Acetylation       AKkRSDQHPVRVRGHY
Acetylation       GHYRAPWGAHKR
Phosphorylation   GRTGRETVDDAIDAVVEEAR
Phosphorylation   NYtPTPPPVSTVDAAIQTVVRGARR
Phosphorylation   NYTPtPPPVSTVDAAIQTVVR
Phosphorylation   WVRDsVSGL

Major core protein VII precursor (accession number 130748)
Modification      Site (site in mature protein VII)   Modified sequence
Acetylation       19                                  FPSkKMFGGAKKRSDQHPVR
Phosphorylation   18                                  FPsKMFGGAKKRSDQHPVR
Acetylation       25 (2)                              GGAKKRSDQHPVRVRGHY
Acetylation       26 (3)                              GGAKkRSDQHPVRVRGHY
Acetylation       47 (24)                             GHYRAPWGAHKR
Phosphorylation   54 (31)                             GRTGRETVDDAIDAVVEEAR
Phosphorylation   71 (48)                             NYtPTPPPVSTVDAAIQTVVR
Phosphorylation   73 (50)                             NYTPtPPPVSTVDAAIQTVVR
Phosphorylation   182 (159)                           WVRDsVSGL


Extended Data Figure 6 | Acetylated protein VII spectra from virus
particles and analysis of total histone PTM changes upon protein VII
expression. a, Liquid chromatography-mass spectrometry
(LC-MS) analysis of unmodified and modified chymotryptic peptide
AKKRSDQHPVRVRGHY. On the left, nano-LC-MS-extracted ion
chromatograms of protein VII peptides identified in the histone extracts
of adenovirus infected cells (Inf) or viral particles (VP). The top left
represents the modified form, while the bottom left represents the
unmodified form. Non-modified forms were detected in both conditions
for Inf and VP, while the acetylated form was unique for the infected
sample only (Inf). On the right, full MS spectrum of the modified
(top) and unmodified (bottom) peptide. Circled mass represents the
monoisotopic signal of the peptide. b, Summary of post-translational
modifications detected on protein VII. Peptides shown were identified
during infection at various time points with the mature protein VII in
the top row and preVII in the bottom row. The numbers in brackets for
preVII indicate the location of the same moiety in mature protein VII.
Acetylation sites were detected in approximately 3% of peptides for mature
protein VII and 2% of peptides in preVII. Phosphorylation was detected in
approximately 1% of peptides for mature protein VII and preVII.
c, d, Quantification of histone H3 (c) and H4 (d) PTMs in
protein-VII-HA-induced (+dox) and -uninduced (−dox) A549 cells from the
analysis of crude histone mixtures (n = 3 biological replicates). Positions
of PTMs are listed along the x axis. Modification type is indicated by
colour as shown. y Axis represents the cumulative extent of PTMs relative
to the total histone H3 or H4, respectively. e, Breakdown of the histone
marks (H3K14ac, H3K27me1, H3K36me3, H4K20me1, H4K20me2 and
H4K20me3) found to be significantly different (n = 3 biological replicates)
in terms of relative abundance between the protein-VII-HA-induced and
-uninduced states (<5% homoscedastic two-tailed t-test). Mean ± s.d.
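
The bracketed site numbers in panel b follow a constant pattern: every preVII site that is listed with a mature equivalent differs from it by 23 residues (25 and 2, 26 and 3, 47 and 24, 54 and 31, 71 and 48, 73 and 50, 182 and 159), while sites 18 and 19 are listed without one. The following Python sketch only illustrates that conversion. The offset of 23 is read off the table rather than stated in the caption, and the function name and the treatment of positions up to 23 as having no mature counterpart are illustrative assumptions.

```python
# Offset between preVII and mature protein VII numbering, inferred from the
# bracketed values in panel b (for example 25 -> 2). Assumption: the offset is
# constant along the protein, as the seven listed pairs suggest.
PREVII_TO_MATURE_OFFSET = 23

# (preVII site, mature site) pairs exactly as listed in panel b.
LISTED_SITES = [(25, 2), (26, 3), (47, 24), (54, 31), (71, 48), (73, 50), (182, 159)]


def mature_position(previi_position, offset=PREVII_TO_MATURE_OFFSET):
    """Convert a preVII residue number to mature protein VII numbering.

    Returns None for positions at or below the offset, which the table lists
    without a mature equivalent (sites 18 and 19).
    """
    if previi_position <= offset:
        return None
    return previi_position - offset


if __name__ == "__main__":
    for previi_site, table_value in LISTED_SITES:
        print(previi_site, "->", mature_position(previi_site), "(table:", table_value, ")")
    print(18, "->", mature_position(18))
```

A lookup of this kind is convenient when comparing peptides identified from preVII with those from the mature protein, since the same modified residue appears under two different numbers.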


[Extended Data Figure 7, panels a and b. Panel a: Venn diagram of Replica 1, Replica 2 and Replica 3. Panel b: log2 fold change (VII-HA induced vs uninduced; <5% homoscedastic test) per replica for the proteins listed below. The numerical columns could not be realigned with the protein names after extraction. Replica 1 values, in extraction order: 9.152, 8.420, 10.077, 1.165, 3.016, 1.348, 1.419, 0.783, 1.352, 3.138, 1.242, 0.410, 0.911, 0.271, 0.428, 0.339, 0.289, 0.852, 0.591.]

High mobility group protein B1 (HMGB1)
High mobility group protein B2 (HMGB2)
Homo sapiens Protein SET (SET)
High mobility group protein B3 (HMGB3)
HLA class I histocompatibility antigen, A-34 alpha chain (HLA-A)
Nucleosome assembly protein 1-like 1 (NAP1L1)
Nexilin (NEXN)
Coiled-coil-helix-coiled-coil-helix domain-containing protein 2, mitochondrial (CHCHD2)
Heterogeneous nuclear ribonucleoprotein F (HNRPF)
Cofilin-1 (HEL-S-15)
Golgi integral membrane protein 4 (GOLIM4)
LIM domain and actin-binding protein 1 (LIMA1)
Heat shock 70 kDa protein 1A/1B (HEL-S-103)
Vimentin (HEL113)
Sideroflexin-3 (SFXN3)
Ras-related protein Ral-A (RALA)
Signal recognition particle 9 kDa protein (SRP9)
NAD(P) transhydrogenase, mitochondrial (NNT)
Casein kinase II subunit beta (CSNK2B)
Putative oxidoreductase GLYR1 (N-PAC)

[Category labels in the figure: Immune response, Viral infection, DNA processing, DNA damage. Colour scale: low to high, −log10 P value.]

[Extended Data Figure 7, panel c. Process networks enrichment.]
Immune response_Phagosome in antigen presentation 
Proteolysis_Ubiquitin-proteasomal proteolysis 

Immune response_Antigen presentation 

DNA damage_DBS repair 

Protein folding_Protein folding nucleus 

Transcription_mRNA processing 

DNA damage_Core 

Immune response_Phagocytosis 

Apoptosis_Apoptotic nucleus 

Inflammation_Protein C signaling 

Translation_Regulation of initiation 

Inflammation_MIF signaling 

DNA damage_Checkpoint 

DNA damage_BER-NER repair 

Transcription_Transcription by RNA polymerase II 
Transcription_Chromatin modification 
DNA damage_MMR repair 


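Inflammation_TREM1 signaling

[Panel c x axis: −log10 P value.]

[Extended Data Figure 7, panel b, continued. Replica 2 entries, in extraction order: 10.176, VII-HA induced, 8.478, 9.817, 1.386, 3.041, 0.621, 1.991, 0.449, 1.529, 0.591, 0.659, 0.594, 0.296, 0.205, 0.811, 0.172, 0.444, 1.089, 0.189.]

[Extended Data Figure 7, panel d. Gene Ontology biological process terms.]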


antigen processing and presentation of exogenous peptide antigen 
positive regulation of immune response 
antigen processing and presentation of endogenous peptide antigen 
immune response-activating cell surface receptor signaling pathway 
positive regulation of tolerance induction to nonself antigen 
immune response-regulating cell surface receptor signaling pathway 
immune response-activating signal transduction 
immune response-regulating cell surface receptor signaling pathway involved in phagocytosis 
antigen processing and presentation of peptide antigen via MHC class II
viral process 
regulation of defense response to virus by virus 
regulation of defense response to virus 
viral life cycle 
establishment of integrated proviral latency
regulation of RNA polymerase II transcriptional preinitiation complex assembly
regulation of transcription initiation from RNA polymerase II promoter 
RNA splicing 
mRNA transport 
mRNA processing 
RNA transport 
establishment of RNA localization
signal transduction involved in mitotic DNA integrity checkpoint 
DNA integrity checkpoint 
DNA ligation 
regulation of DNA-templated transcription, initiation 
DNA geometric change 
somatic cell DNA recombination 
DNA metabolic process 
DNA recombination 
positive regulation of DNA-templated transcription, initiation
DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest 
signal transduction involved in mitotic G1 DNA damage checkpoint 
signal transduction involved in mitotic DNA damage checkpoint 
signal transduction involved in DNA damage checkpoint 
signal transduction in response to DNA damage 
regulation of apoptotic process 
regulation of programmed cell death 
cellular response to DNA damage stimulus
response to stress 
DNA repair 
DNA ligation involved in DNA repair
signal transduction involved in mitotic cell cycle checkpoint 
positive regulation of ubiquitin-protein ligase activity involved in regulation of mitotic cell cycle transition 
negative regulation of mitotic cell cycle phase transition 
positive regulation of cell cycle arrest 
negative regulation of cell cycle phase transition 
regulation of cell cycle arrest 
negative regulation of G1/S transition of mitotic cell cycle 
negative regulation of cell cycle G1/S phase transition 
negative regulation of cell cycle process


Extended Data Figure 7 | Bioinformatic analysis of proteins enriched
in the high-salt fraction upon protein VII expression. a, Venn diagram
showing overlap between three biological replicates of high-salt-fraction
proteins significantly enriched compared with uninduced cells.
b, Proteins found significantly enriched in the protein-VII-HA-induced
state compared with uninduced (<5% homoscedastic t-test) in all three
biological replicates ('VII-HA induced' indicates proteins identified only
in protein-VII-HA-induced condition). c, d, Classification of proteins
significantly enriched in minimum two out of three biological replicates
(protein-VII-HA-induced versus uninduced) according to process
network enrichment and Gene Ontology biological process (GeneGo
MetaCore pathways analysis package; false discovery rate (FDR) < 5%);
each Gene Ontology term was ranked using P-value enrichment.
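
The caption reports an FDR threshold of 5% and a ranking of terms by enrichment P value, but it does not say how the MetaCore package computes the FDR. As an illustration only, the Python sketch below applies the Benjamini-Hochberg procedure, a common way of controlling the FDR, and then ranks the surviving terms by raw P value. The choice of Benjamini-Hochberg, the function names, and the term names and P values in `EXAMPLE_TERMS` are all assumptions made for the example; none of them comes from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values can be forced to be non-decreasing in rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def rank_terms(term_p_values, fdr=0.05):
    """Keep terms whose adjusted P value is below `fdr`, ranked by raw P value."""
    terms = list(term_p_values)
    adjusted = benjamini_hochberg([term_p_values[t] for t in terms])
    kept = [(t, term_p_values[t], q) for t, q in zip(terms, adjusted) if q < fdr]
    return sorted(kept, key=lambda row: row[1])


# Hypothetical terms and P values, invented purely for illustration.
EXAMPLE_TERMS = {
    "term_A": 0.001,
    "term_B": 0.008,
    "term_C": 0.02,
    "term_D": 0.2,
    "term_E": 0.6,
}

if __name__ == "__main__":
    for term, p, q in rank_terms(EXAMPLE_TERMS):
        print(term, p, round(q, 4))
```

Ranking by raw P value after filtering on the adjusted value mirrors the two steps named in the caption: the FDR cut-off decides which terms are reported, and the P value decides their order.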


Extended Data Figure 8 | Protein VII retains HMGB1 and HMGB2 in
chromatin. a, Western blot of adenovirus-infected or doxycycline-treated
A549 cells showing the relative levels of protein VII expression. HMGB1
levels do not change upon infection or protein VII expression. Tubulin
is shown as a loading control. b, Quantitative PCR analysis of mRNA
transcripts of HMGB1 in various cell types as indicated (for A549, n = 3
biological replicates; for THP-1, n = 2 biological replicates; mean ± s.d.).
The levels of HMGB1 do not significantly change. c, Immunofluorescence
analysis of a time course of protein-VII-HA (red) induction shown with
HMGB1 (green) and DAPI (grey, blue in merge) in A549 cells. Expression
of protein-VII-HA results in a change to the HMGB1 distribution upon
expression. d, HMGB1 (green) localization changes between 12 and 24 hpi
of wild-type adenovirus in A549 cells, and adopts a pattern similar to
protein VII as in Fig. 1a. DBP (red) is shown as a marker of infection, DNA
is stained with DAPI (blue in merge). e, Same as d showing that HMGB2
adopts the same pattern as HMGB1 during Ad5 infection at 24 hpi.
f, Multiple cells showing the same pattern of HMGB1 relocalization upon
expressing protein-VII-GFP as in Fig. 3g. g, HMGB1 retention in the
high-salt fraction is conserved across adenovirus serotypes. Western blot
analysis of HMGB1 from salt-fractionated A549 cells infected with Ad5,
Ad9 or Ad12 as shown. Scale bars, 10 μm.


Extended Data Figure 9 | Protein VII is necessary and sufficient for
chromatin retention of HMGB1 in human and mouse cells.
a, b, Replication of Ad5-flox-VII virus on 293 or 293-Cre cells.
Quantitative PCR analysis of viral genomic DNA over a time course of
infection (a) shows the DBP gene is increasing exponentially in 293 and
293-Cre cells when infected with Ad5-flox-VII virus. In contrast, PCR
for the protein VII gene (b) demonstrates deletion in 293-Cre cells (n = 2
biological replicates, mean ± s.d.). c, Salt fractionation of 293-Cre cells
infected with wild-type Ad5, indicating that the Cre recombinase does
not interfere with the ability of protein VII to retain HMGB1 in the
high-salt chromatin fraction. Protein VII is also necessary for the chromatin
retention of HMGB2. d, Transduction of THP-1 cells to express
protein-VII-GFP results in chromatin distortion and HMGB1 retention in chromatin.
Immunofluorescence of transduced PMA-treated THP-1 cells showing
protein-VII-GFP (green), HMGB1 (red) and DNA (grey, blue in merge).
e, Transduction to express protein-VII-GFP is sufficient to relocalize
mouse HMGB1 in mouse embryonic fibroblast (MEF) cells. f, Salt
fractionation of mouse embryonic fibroblast cells transduced to express
protein-VII-GFP. Human Ad5 protein VII is sufficient to retain mouse
HMGB1 in the high-salt fraction in MEF cells. The control vector
expressing GFP alone does not have this effect.


Extended Data Figure 10 | Transduction of mouse lungs demonstrating
expression of GFP or protein-VII-GFP. a, Sections of mouse lungs
transduced to express protein-VII-GFP or GFP co-stained for HMGB1.
GFP signal shows multiple cell types transduced in both cases.
Protein-VII-GFP has a more distinct nuclear signal than GFP, which also appears
cytoplasmic. Two sections for each condition are shown to indicate
transduction efficiency. b, Same as a but co-stained for prosurfactant-C
to mark type II pneumocytes. Some cells are positive for both, confirming
that multiple cell types were transduced. c, Zoomed images of individual
epithelial cells from mouse lungs showing the characteristic
protein-VII-GFP pattern colocalizing with DAPI in the nucleus. GFP only is mostly
cytoplasmic. d, Schematic summarizing function of protein VII during
infection. Newly synthesized protein VII late during infection can be
post-translationally modified and binds to HMGB1, sequestering it on the
cellular chromatin and preventing its release. Unmodified protein VII is
packaged in viral progeny.


LETTER 


doi:10.1038/nature18316 


The nature of mutations induced by replication-transcription collisions


T. Sabari Sankar*, Brigitta D. Wastuwidyaningtyas*, Yuexin Dong, Sarah A. Lewis & Jue D. Wang


The DNA replication and transcription machineries share a
common DNA template and thus can collide with each other
co-directionally or head-on. Replication-transcription collisions can
cause replication fork arrest, premature transcription termination,
DNA breaks, and recombination intermediates threatening genome
integrity. Collisions may also trigger mutations, which are major
contributors to genetic disease and evolution. However, the
nature and mechanisms of collision-induced mutagenesis remain
poorly understood. Here we reveal the genetic consequences of
replication-transcription collisions in actively dividing bacteria
to be two classes of mutations: duplications/deletions and base
substitutions in promoters. Both signatures are highly deleterious
but are distinct from the previously well-characterized base
substitutions in the coding sequence. Duplications/deletions are
probably caused by replication stalling events that are triggered
by collisions; their distribution patterns are consistent with where
the fork first encounters a transcription complex upon entering
a transcription unit. Promoter substitutions result mostly from
head-on collisions and frequently occur at a nucleotide that is
conserved in promoters recognized by the major σ factor in bacteria.
This substitution is generated via adenine deamination on the
template strand in the promoter open complex, as a consequence
of head-on replication perturbing transcription initiation. We
conclude that replication-transcription collisions induce distinct
mutation signatures by antagonizing replication and transcription,
not only in coding sequences but also in gene regulatory elements.

Mutations cause genetic diseases and drive evolution by altering
either the gene-coding sequence or the noncoding elements that
control gene expression. A variety of mechanisms underlie mutagenesis:
DNA replication errors, error-prone repair, transcription-associated
mutagenesis (TAM), and replication-stalling-mediated template
switch. Many mutagenic mechanisms depend on two fundamental
processes: replication or transcription. However, little is known
about the mutagenic mechanisms involving replication-transcription
collision, an unavoidable outcome of the two processes sharing the
same DNA template. Identifying the mutagenic consequences of
replication-transcription collisions remains an important challenge
owing to the difficulty of differentiating collision-induced mutation
events from those induced by either replication or transcription.

An experimental approach to identifying collision-induced mutagenesis
is to analyse the mutagenic consequence of altering the relative
directionality of transcription to replication. Head-on collisions
are proposed to generate mutations more frequently than co-directional
collisions, which may underlie the genome-wide bias for essential
genes to be transcribed co-directionally to replication. In support
of this hypothesis, in the bacterium Bacillus subtilis (in which 94% of
essential genes are co-directional), base substitution rates are higher
within genes oriented head-on than co-directional to replication.
However, the orientation-dependent difference in substitution rates can
also be explained by the difference in the fidelity between leading- and
lagging-strand replication, challenging the notion that collisions
generate base substitutions in coding sequence. Thus,
conclusive evidence for collision-induced mutations is still lacking
and necessitates a systematic analysis of collision-generated mutation
signatures beyond base substitutions in the coding sequence.

Here, we investigate whether mutations are generated by collisions by
identifying the signatures and characterizing the mechanisms of mutations
caused by co-directional versus head-on collisions. We first developed
an assay that can detect a wide range of mutations in B. subtilis.
We chose the thymidylate synthetase gene thyP3 because any complete
loss-of-function mutation in thyP3 can be selected using trimethoprim
resistance (Extended Data Fig. 1a). To evaluate the effect of gene
directionality on mutagenesis, we placed thyP3 under an
isopropyl-β-D-thiogalactoside (IPTG)-inducible promoter at a single location on the
chromosome in either co-directional or head-on orientation (Fig. 1a).
To estimate mutation rates, we performed the Luria–Delbrück fluctuation
test using multiple growth cultures, selected for thyP3 mutants
after growth, and statistically determined the rate of spontaneous
mutations in thyP3 (Fig. 1b). Two additional features of our assay
allowed critical analyses of mutagenesis. First, we chose the non-native,
phage-encoded thyP3 as the target sequence and deleted thyA, the
native homologue of thyP3. We avoided using a native gene to evaluate
the impact of gene directionality on mutagenesis because evolution may
have already eliminated potential mutation hotspots within a native
gene in its original orientation. Second, we took advantage of the
temperature sensitivity of a second endogenous gene, thyB, to ensure that
mutants were not defective during growth, which could complicate
the measurement of mutation rate. We grew cells at the permissive
temperature (37 °C), during which the functional ThyB masks any
competitive disadvantage of thyP3 mutations (Extended Data Fig. 1b).
Selection was done at a non-permissive temperature under which ThyB
is inactivated and phenotypes associated with thyP3 mutation would be
exposed (Fig. 1b and Extended Data Fig. 1c). In the presence of thyB,
mutants follow the Luria–Delbrück distribution (Fig. 1c), demonstrating
that mutations arise with a constant rate per cell division before and
not after selection. Notably, the use of thyB was critical because
without thyB, mutants followed the Poisson distribution instead of
the expected Luria–Delbrück distribution (Fig. 1c), presumably due to
the growth defect of thyP3 mutants (Extended Data Fig. 1d).
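
The difference between the two distributions contrasted in Fig. 1c can be illustrated numerically. The Python sketch below is not the authors' analysis. It assumes the Lea-Coulson formulation of the Luria–Delbrück distribution, evaluated with the Ma-Sandri-Sarkar recursion, and an arbitrary expected number of mutations per culture (m = 1); the function names are likewise illustrative. It computes P(r), the proportion of cultures with r or more mutants, and shows that on a log-log scale the Luria–Delbrück tail falls with a slope close to −1, whereas a Poisson distribution with the same m has almost no cultures with many mutants.

```python
import math


def luria_delbruck_pmf(m, r_max):
    """P(exactly r mutants), r = 0..r_max, under the Lea-Coulson model.

    Uses the Ma-Sandri-Sarkar recursion:
    p_0 = exp(-m), p_r = (m / r) * sum_{i=0}^{r-1} p_i / (r - i + 1).
    """
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        total = sum(p[i] / (r - i + 1) for i in range(r))
        p.append(m / r * total)
    return p


def poisson_pmf(mean, r_max):
    """P(exactly r events), r = 0..r_max, for a Poisson distribution."""
    p = [math.exp(-mean)]
    for r in range(1, r_max + 1):
        p.append(p[-1] * mean / r)
    return p


def tail(pmf, r):
    """Proportion of cultures with r or more mutants, P(r)."""
    return 1.0 - sum(pmf[:r])


def loglog_slope(pmf, r1, r2):
    """Slope of log10 P(r) against log10 r between r1 and r2."""
    rise = math.log10(tail(pmf, r2)) - math.log10(tail(pmf, r1))
    run = math.log10(r2) - math.log10(r1)
    return rise / run


if __name__ == "__main__":
    m = 1.0  # assumed expected number of mutations per culture
    ld = luria_delbruck_pmf(m, 400)
    po = poisson_pmf(m, 400)
    print("Luria-Delbruck log-log slope (r = 100 to 400):", loglog_slope(ld, 100, 400))
    for r in (2, 5, 10):
        print("r =", r, "LD tail:", tail(ld, r), "Poisson tail:", tail(po, r))
```

The heavy tail arises because a mutation that occurs early during growth gives rise to a large clone of mutants (a "jackpot" culture), which is the behaviour expected when mutations arise before selection rather than in response to it.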

Using this assay, we compared mutations resulting from co-directional-
versus head-on-oriented thyP3. When induced by IPTG, transcription
reaches similar levels from either co-directional or head-on thyP3
(Extended Data Fig. 2a). A ~60% increase in total mutation rate in
the head-on thyP3 compared with co-directional thyP3 was observed
(Fig. 1d and Extended Data Fig. 2b-e). Next, we sequenced ~2,000
mutants and obtained ~400 distinct mutations (Extended Data Figs 3
and 4). Fewer than a third of the mutations observed in thyP3 under
induced transcription were base substitutions within the coding region.

1Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA. 2Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA.
*These authors contributed equally to this work.


178 | NATURE | VOL 535 | 7 JULY 2016 


[Figure 1 panel labels recovered from the extracted figure. Panel b: mutation generation with IPTG at 37 °C (ThyB+); mutation selection on trimethoprim + IPTG at 45 °C (ThyB−). Panel c: log P(r) against log r for thyBts+ and thyBts− strains, with Luria–Delbrück and Poisson curves and the annotation "Slope = −1.0". Panel d: mutation rate per cell per generation (×10⁻⁸) for co-directional and head-on thyP3, subdivided into indels, base substitutions in promoter and base substitutions in coding sequence.]

Figure 1 | Transcription directionality affects spontaneous mutation
rates and spectra in B. subtilis. a, thyP3 gene with an IPTG-inducible
promoter is integrated into the chromosome either co-directionally or
head-on to replication. Purple arrow indicates replication direction;
oriC indicates replication origin; and terC indicates replication terminus.
b, Modified fluctuation test to measure the rate of spontaneous mutations
conferring trimethoprim resistance. ts, temperature sensitive. c, Distribution
of mutants: number of mutants per culture (r) plotted against proportion
of cultures with ≥ r mutants (P(r)). d, The mutation rates in co-directional
and head-on thyP3 (subdivided by mutation spectra) when transcription
is induced with IPTG. Mutation rates are expressed as mean ± standard
error of the mean (s.e.m.).


The remaining majority of mutations fall into two prominent classes: 
insertions/deletions (indels) and promoter base substitutions. Their 
mutation rates are strongly and differentially altered by transcription 
directionality and strength (Extended Data Fig. 2e). These alterations 
are mostly not due to competitive or selection bias of the mutants 
(Extended Data Fig. 5). Further analyses, described later, revealed that 
indels and promoter substitutions are probably induced by replication- 
transcription collisions. 

Indels are probably generated upon stalling of a replication fork
after collision with a transcription complex or a transcription
factor. First, the majority of indels are duplications/deletions
between repeated DNA sequences (3-522 base pairs (bp); Extended
Data Fig. 6a), which were proposed to originate from slippage or
template switch of stalled replication forks. Second, the frequencies of
indels at different locations within thyP3 are strongly influenced by its
transcription orientation and strength (Fig. 2a-f and Extended Data
Fig. 6b). When thyP3 was co-directional to replication, indels were
predominantly enriched at the promoter and 5′ half of the coding
region (Fig. 2a), including promoter-proximal regions where RNA
polymerases (RNAPs) are known to often pause. In contrast, when
thyP3 was head-on, indels were found predominantly within the 3′
half (Fig. 2b), a bias that is largely absent when transcription was
uninduced (at basal level) (Fig. 2c, d). This transcription-dependent
enrichment pattern reflects the vicinity in which the replication fork
first encounters a transcription complex upon entering a transcription
unit (Fig. 2g and Extended Data Fig. 6c). Promoter deletions depended
on the recombination protein RecA, and thus are mostly caused by
recombination after replication fork collision with the transcription
initiation complex or repressors (Extended Data Fig. 7). However,
the distribution of indels within the transcribed sequence was not
affected by RecA, suggesting that recombination is not necessary for
their generation. Instead, these indels may arise when collision with the
transcription elongation complex stalls replication fork progression,
which can induce fork slippage, template switch or fork reversal that
leads to duplications/deletions, or when collision-generated DNA breaks
are followed by microhomology-mediated break-induced replication or
microhomology-mediated end joining (Extended Data Fig. 6e, f). Our work thus reveals


LETTER 


[Figure 2 graphics not reproduced. Recoverable labels: co-directional and head-on thyP3 with (+IPTG) and without (−IPTG) induction, with insertion and deletion markers (panels a–d); indel rate per cell per generation (× 10⁻⁸) at the promoter and within the coding region (panels e, f); replisome, RNAP and collision site for co-directional and head-on collisions, with deletion and duplication outcomes (panel g).]

Figure 2 | Distribution of indels is strongly dependent on transcription
directionality and strength. a–d, Positional distribution of indels of >3 bp
in co-directional and head-on thyP3 under induced (+IPTG)
and uninduced (−IPTG) conditions. Each bar represents an insertion
(black) or deletion (red). e, The rates of >3 bp indels at the promoter.
f, The rates of >3 bp indels within 5’ (1-420 bp) and 3’ (421-840 bp) of the
coding region. Rates of insertions are plotted separately from deletions in
Extended Data Fig. 6d. g, Model illustrating the mechanism of generation
of indels in the vicinity of the collision site in co-directional and head-on
orientations, via fork slippage (shown here), template switch or fork
reversal (Extended Data Fig. 6e). Error bars indicate s.e.m.


the strong contribution of replication-transcription conflicts to the 
generation of indels. 

We next analysed base substitutions in the coding sequence, which 
have been proposed to be generated by replication-transcription 
conflicts¹¹. In contrast to indels, base substitutions within the coding
sequence were not enriched near locations of replication-transcription 
collisions (Fig. 3a—d). In addition, base substitution rates were similar 
under induced transcription compared with basal levels when consid- 
ering identical mutation target sites (Extended Data Fig. 8a). We again 
observed higher substitution rates in the coding sequence of head-on 
than co-directional genes”"', which are probably due to different 
replication fidelity between leading and lagging strands!*"’, although 
collisions cannot be ruled out as a source of these mutations. 

By contrast with coding sequence substitutions, promoter base sub- 
stitutions were elevated upon induction of transcription, suggesting 
that transcription initiation causes genome instability at the promoter 
(Fig. 3e and Extended Data Fig. 8b). Most strikingly, this increase in 
promoter substitution rates is much stronger (400%) for head-on than 
co-directional transcription, strongly suggesting that head-on colli- 
sions generate promoter substitutions. To examine the generality of 
this observation, we performed a genome-wide phylogenetic analysis 
to estimate nucleotide substitutions per site in promoters from multiple 
natural isolates of Bacillus. The analysis showed that promoters of 
head-on genes have higher nucleotide substitutions than promoters 
of co-directional genes (Fig. 3f). Thus, head-on transcription not 
only increases the mutation rate of the thyP3 promoter sequence, 
but increases the mutation rate on a genome-wide scale in natural 
populations. 


7 JULY 2016 | VOL 535 | NATURE | 179 


© 2016 Macmillan Publishers Limited. All rights reserved 


[Figure 3 graphics not reproduced. Recoverable labels: co-directional and head-on thyP3 with (+IPTG) and without (−IPTG) induction (panels a–d); other substitutions, co-directional and head-on, −IPTG and +IPTG (panel e); leading and lagging (panel f).]


Figure 3 | Head-on transcription induces base substitutions at the
promoter. a–d, Positional distribution of base substitutions in
co-directional and head-on thyP3 under induced (+IPTG) and uninduced
(−IPTG) conditions. Each dot records a base substitution mapped in a
50-nucleotide window. e, The promoter base substitution rate is strongly
increased in the head-on orientation upon IPTG induction. f, Mean
nucleotide substitutions per site for each promoter, estimated pairwise
among natural Bacillus isolates. Distribution for lagging-strand promoters
(n = 32) compared with leading-strand promoters (n = 147). Nucleotide
substitutions are also compared between promoters bound and not bound
by transcriptional repressors (Extended Data Fig. 8c). Central mark of box
plot represents median, edges are 25th and 75th centiles, notches are 95%
confidence interval of median, and whiskers represent extreme data points
within range. NS, not significant; *P < 0.05, **P < 0.01, ***P < 0.0001;
Student’s t-test, Mann-Whitney U-test. Error bars indicate s.e.m.


The most frequent substitution within the thyP3 promoter is at a conserved
nucleotide in the −10 element recognized by the major σ factor,
T₋₇ (Fig. 4a). T₋₇→C₋₇ substitution accounted for all promoter substitutions
and 50% of total mutation events upon induced transcription
of head-on thyP3 (Fig. 3e). This enrichment is not due to competitive
advantage of the C₋₇ mutant over wild type or other thyP3 mutants
(Fig. 4b and Extended Data Fig. 5a-c), indicating that T₋₇→C₋₇ is a
bona fide mutation hotspot obtainable with our assay. Importantly, T₋₇
is conserved across species and occurs in the promoters of ~50-70%
of essential genes in B. subtilis and Escherichia coli²⁵. The possibility
that these promoters are all susceptible to transcription-induced
T→C mutagenesis implicates a previously unidentified, pervasive
mechanism that can inactivate the transcription of many genes and
result in loss of viability. Indeed, in E. coli T₋₇→C₋₇ was observed as a
mutation hotspot in the head-on orientation in a plasmid-based assay
(Extended Data Fig. 8d)²⁶, and T→C was also observed in other positions
of cis-regulatory elements beyond the −7 position²⁷, suggesting
that base substitutions in gene-regulatory elements are a signature of
head-on transcription in bacteria.

To examine the mechanism underlying this mutation, we used a
restriction-enzyme-based assay that exclusively detects T₋₇→C₋₇
(Extended Data Fig. 8e) in thyP3 to test several alternatives. First, we
found that the error-prone DNA polymerase PolIV, which was proposed
to be responsible for collision-induced substitutions¹⁹, is not
a major contributor of this mutation (Fig. 4c). Second, T₋₇→C₋₇ is
not generated by error-prone recombination repair as it still occurs
frequently in the absence of recA (Extended Data Fig. 7e). Third, we
examined whether a commonly occurring G-T wobble mismatch,
which is generated by the replicative DNA polymerase and efficiently
corrected by mismatch repair²⁸, accounts for this mutation. Inactivating
mismatch repair increased the mutation rate of thyP3 by ~60-fold,
similar to other mutation assays¹⁸, and increased T→C substitutions
at hotspots in the coding sequence by ~1,000-fold (Extended Data
Fig. 8f, g). Notably, we did not find any T₋₇→C₋₇ substitutions upon
screening ~1,000 mismatch repair mutants, suggesting that T₋₇→C₋₇
is not generated via G-T mismatch.

After ruling out these known models of mutagenesis, we propose a
new model that explains the frequent T₋₇→C₋₇ substitutions on the
basis of the structure of the bacterial promoter open complex where the
−10 element is single-stranded²⁵,²⁹,³⁰. Specifically, during transcription
initiation, T₋₇ on the non-template strand is buried in a σ-factor pocket,
and its complementary base on the template strand (A₋₇) is unpaired
and vulnerable to spontaneous deamination to hypoxanthine?”
(Fig. 4d). Hypoxanthine can base pair with cytosine during replication,
leading to the T₋₇→C₋₇ mutation. This model is further supported by
our data that treating cells with nitrous acid, an inducer of base deamination,
leads to an increased frequency of T₋₇→C₋₇ mutation, which
is more pronounced in the hypoxanthine-DNA glycosylase mutant
(Fig. 4e), supporting hypoxanthine as the premutagenic intermediate.
The cellular adenine deaminase is not a major factor responsible for


[Figure 4 graphics not reproduced. Recoverable labels: promoter (C₋₇) mutant vs WT plotted against generations (panel b); co-directional and head-on, wild type and ΔyqjH (panel c); NTS and TS strands, A₋₇ is deaminated to HX₋₇, which pairs with C₋₇ (panel d); buffer treated and nitrous acid treated, wild type and ΔyxlJ (panel e).]


Figure 4 | Promoter T₋₇→C₋₇ is a mutation hotspot generated via
deamination. a, The consensus −10 element of B. subtilis SigA-dependent
promoters (n = 358). The strongly conserved T₋₇ is frequently mutated to
a C. b, Fitness of head-on T₋₇→C₋₇ mutant relative to head-on wild-type
(WT) thyP3 cells under induced transcription (mean ± standard deviation
(s.d.)). vs, versus. c, Mutation rate of T₋₇→C₋₇ in yqjH mutant (error-prone
polymerase PolIV). d, Model illustrating the mechanism of generation
of T₋₇→C₋₇. During transcription initiation, the −10 element
is single-stranded, T₋₇ on the non-template strand (NTS) is flipped into
the σ-binding pocket, creating solvent accessibility for A₋₇ on the template
strand (TS), allowing it to be deaminated to hypoxanthine (HX). HX
base pairs with C during replication, resulting in T₋₇→C₋₇. e, T₋₇→C₋₇
frequencies in head-on thyP3 upon nitrous acid treatment of wild-type
and ΔyxlJ (hypoxanthine-DNA glycosylase) strains. *P < 0.05; Student's
t-test. Error bars indicate s.e.m.




T₋₇→C₋₇ mutation (Extended Data Fig. 8h), indicating that A₋₇ is
spontaneously deaminated while sequestered within the transcription 
initiation complex. It is likely that other bases within the promoter 
open complex can also be mutated via deamination, although those 
mutations do not completely abolish gene expression and so cannot 
be identified by our assay. Our work thus uncovers a mechanism of 
promoter mutagenesis that implies a greater susceptibility of promoters 
to mutation than has previously been realised. 
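
The base-pairing logic of the proposed deamination model can be traced in a few lines. The sketch below is illustrative only: it assumes a toy −10 element (TATAAT) whose last column stands in for the T₋₇/A₋₇ pair, and the helper names are hypothetical. It shows that once the template-strand adenine becomes hypoxanthine, copying that strand places a C where the non-template strand carried a T.

```python
# Watson-Crick partners, plus hypoxanthine (HX), which pairs with cytosine.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G", "HX": "C"}


def copy_strand(template):
    """Return the base inserted opposite each template base, column by column."""
    return [PARTNER[base] for base in template]


def deaminate_adenine(strand, position):
    """Return a copy of the strand with the adenine at `position` turned into HX."""
    if strand[position] != "A":
        raise ValueError("only adenine is deaminated to hypoxanthine")
    damaged = list(strand)
    damaged[position] = "HX"
    return damaged


# Toy -10 element written as aligned columns (an assumption for illustration).
non_template = list("TATAAT")          # last column stands in for T-7
template = copy_strand(non_template)   # A sits opposite that T

damaged_template = deaminate_adenine(template, 5)
new_non_template = "".join(copy_strand(damaged_template))
print(new_non_template)
```

A further round of copying pairs the new C with G, so the change is fixed on both strands.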

Our proposed mechanism represents a novel mutagenesis pathway 
that is distinct from TAM¹⁰, which introduces substitutions within the
transcribed sequence via deamination on the non-template strand, 
while the template strand of the coding sequence is protected by base 
pairing with nascent RNA (that is, RNA-DNA hybrid). In contrast, 
the promoter is upstream of the transcription start site, and thus is not 
protected by RNA-DNA hybrid but remains vulnerable to deamination 
or other premutagenic DNA damage upon open complex formation 
(Fig. 4d). We propose a model that head-on replication interferes with 
RNA polymerase escape from the promoter, rendering the promoter 
open complex more susceptible to premutagenic DNA damage, sub- 
sequently leading to mutations. 

Our work reveals two types of collision-induced mutations, indels 
and promoter substitutions, which are generated by distinct mecha- 
nistic pathways probably resulting from mutual antagonism between 
replication and transcription upon collision. Furthermore, our work 
supports the hypothesis that collision-induced mutagenesis contributes 
to the evolution of the strong co-directional bias of essential genes® and 
reveals that orientation-biased promoter mutations underlie this con- 
served aspect of genome organization. We suspect that these mutation 
signatures have important implications not only for understanding the 
fitness and evolution of bacteria but also across domains of life including 
humans. Indels can lead to copy number variation, an important 
cause of genetic diseases. Mutations in cis-regulatory elements lead to 
misregulation of gene expression, and cis-regulatory elements are found 
to be more susceptible to mutagenesis than coding regions in eukar- 
yotic genomes’”. Thus, harmonizing replication with transcription is 
a key factor in fitness and genome evolution across domains of life.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 12 March 2015; accepted 11 May 2016. 
Published online 29 June 2016. 


1. French, S. Consequences of replication fork movement through transcription 
units in vivo. Science 258, 1362-1365 (1992). 

2. Liu, B. & Alberts, B. M. Head-on collision between a DNA replication apparatus 
and RNA polymerase transcription complex. Science 267, 1131-1137 (1995). 

3. Vilette, D., Ehrlich, S. D. & Michel, B. Transcription-induced deletions in 
Escherichia coli plasmids. Mol. Microbiol. 17, 493-504 (1995). 

4. Prado, F. & Aguilera, A. Impairment of replication fork progression mediates RNA 
polII transcription-associated recombination. EMBO J. 24, 1267-1276 (2005).

5. Mirkin, E. V. & Mirkin, S. M. Mechanisms of transcription-replication collisions 
in bacteria. Mol. Cell. Biol. 25, 888-895 (2005). 

6. Pomerantz, R. T. & O’Donnell, M. The replisome uses mRNA as a primer after
colliding with RNA polymerase. Nature 456, 762-766 (2008). 

7. Srivatsan, A., Tehranchi, A., MacAlpine, D. M. & Wang, J. D. Co-orientation of 
replication and transcription preserves genome integrity. PLoS Genet. 6, 
e1000810 (2010). 

8. Dutta, D., Shatalin, K., Epshtein, V., Gottesman, M. E. & Nudler, E. Linking RNA 
polymerase backtracking to genome instability in E. coli. Cell 146, 533-543 
(2011). 

9. Merrikh, H., Machon, C., Grainger, W. H., Grossman, A. D. & Soultanas, P. 
Co-directional replication-transcription conflicts lead to replication restart. 
Nature 470, 554-557 (2011). 

10. Kim, N. & Jinks-Robertson, S. Transcription as a source of genome instability. 
Nature Rev. Genet. 13, 204-214 (2012).




11. Paul, S., Million-Weaver, S., Chattopadhyay, S., Sokurenko, E. & Merrikh, H. 
Accelerated gene evolution through replication-transcription conflicts. Nature 
495, 512-515 (2013). 

12. Fijalkowska, I. J., Jonczyk, P., Tkaczyk, M. M., Bialoskorska, M. & Schaaper, R. M.
Unequal fidelity of leading strand and lagging strand DNA replication on the 
Escherichia coli chromosome. Proc. Natl Acad. Sci. USA 95, 10020-10025 
(1998). 

13. Bruand, C., Bidnenko, V. & Ehrlich, S. D. Replication mutations differentially 
enhance RecA-dependent and RecA-independent recombination between 
tandem repeats in Bacillus subtilis. Mol. Microbiol. 39, 1248-1258 (2001). 

14. Hastings, P. J., Lupski, J. R., Rosenberg, S. M. & Ira, G. Mechanisms of change in 
gene copy number. Nature Rev. Genet. 10, 551-564 (2009). 

15. Kunkel, T. A. Evolving views of DNA replication (in)fidelity. Cold Spring Harb. 
Symp. Quant. Biol. 74, 91-101 (2009). 

16. Rocha, E. P. C. & Danchin, A. Essentiality, not expressiveness, drives
gene-strand bias in bacteria. Nature Genet. 34, 377-378 (2003). 

17. Reijns, M. A. M. et al. Lagging-strand replication shapes the mutational
landscape of the genome. Nature 518, 502-506 (2015).

18. Schroeder, J. W., Hirst, W. G., Szewczyk, G. A. & Simmons, L. A. The effect of
local sequence context on mutational bias of genes encoded on the leading
and lagging strands. Curr. Biol. 26, 692-697 (2016).

19. Million-Weaver, S. et al. An underlying mechanism for the increased
mutagenesis of lagging-strand genes in Bacillus subtilis. Proc. Natl Acad.
Sci. USA 112, E1096-E1105 (2015).

20. Luria, S. E. & Delbrück, M. Mutations of bacteria from virus sensitivity to
virus resistance. Genetics 28, 491-511 (1943).

21. Rosche, W. A. & Foster, P. L. Determining mutation rates in bacterial
populations. Methods 20, 4-17 (2000).

22. Mirkin, E. V., Castro Roa, D., Nudler, E. & Mirkin, S. M. Transcription regulatory
elements are punctuation marks for DNA replication. Proc. Natl Acad. Sci. USA
103, 7276-7281 (2006).

23. Vilette, D., Uzest, M., Ehrlich, S. D. & Michel, B. DNA transcription and repressor
binding affect deletion formation in Escherichia coli plasmids. EMBO J. 11,
3629-3634 (1992).

24. Tehranchi, A. K. et al. The transcription factor DksA prevents conflicts
between DNA replication and transcription machinery. Cell 141, 595-605
(2010).

25. Feklistov, A. & Darst, S. A. Structural basis for promoter −10 element
recognition by the bacterial RNA polymerase σ subunit. Cell 147, 1257-1269
(2011).

26. Yoshiyama, K., Higuchi, K., Matsumura, H. & Maki, H. Directionality of DNA
replication fork movement strongly affects the generation of spontaneous
mutations in Escherichia coli. J. Mol. Biol. 307, 1195-1206 (2001).

27. Schaaper, R. M., Danforth, B. N. & Glickman, B. W. Mechanisms of spontaneous
mutagenesis: an analysis of the spectrum of spontaneous mutation in the
Escherichia coli lacI gene. J. Mol. Biol. 189, 273-284 (1986).

28. Lu, A.-L., Clark, S. & Modrich, P. Methyl-directed repair of DNA base-pair
mismatches in vitro. Proc. Natl Acad. Sci. USA 80, 4639-4643 (1983).

29. Zhang, Y. et al. Structural basis of transcription initiation. Science 338, 
1076-1080 (2012). 

30. Zuo, Y. & Steitz, T. A. Crystal structures of the E. coli transcription initiation 
complexes with a complete bubble. Mol. Cell 58, 534-540 (2015). 


Acknowledgements We thank E. Robleto, R. Yasbin and L. Simmons for 
strains, M. Cox, R. Gourse, C. Hittinger, R. Landick, K. Wasserman, C. Gross, 
M. Laub, S. Rosenberg, L. Simmons and the Wang laboratory for discussions 
and comments on the manuscript. This work was supported by the National 
Institutes of Health Director’s New Innovator Award DP2OD004433 to J.D.W.


Author Contributions J.D.W. conceptualized the study. T.S.S., B.D.W. and 
J.D.W. designed the experiments. T.S.S. performed thyP3 fluctuation tests and 
sequencing of recA, yqjH, mutSL and adeC mutants, comparative genomic 
analyses, nitrous acid mutagenesis, competition assays and plating efficiency 
of mutants. B.D.W. developed the forward mutation assay, fluctuation tests 
and sequencing of wild-type strains, qRT-PCR, nalidixic acid fluctuation test,
doubling time measurements and developed the restriction digest screening. 
Y.D. assisted the competition assay, plating efficiency and mutSL fluctuation 
tests. S.A.L. performed thyP3 fluctuation tests with B.D.W. T.S.S., B.D.W. and 
J.D.W. analysed the data and wrote the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to
J.D.W. (wang@bact.wisc.edu).


Reviewer Information Nature thanks S. Mirkin, E. Nudler and R. Pomerantz for 
their contribution to the peer review of this work. 




METHODS 


Media and growth conditions. Unless otherwise indicated, cells were grown in S7 
defined medium?! containing 50 mM MOPS and supplemented with 1% glucose, 
0.1% glutamate, 40 µg ml⁻¹ tryptophan and 20 µg ml⁻¹ thymine (Sigma-Aldrich) at
37 °C with vigorous shaking, and plated on solid medium (Spizizen’s medium)
supplemented with 1% glucose, 0.1% glutamate, 40 µg ml⁻¹ tryptophan and 20 µg ml⁻¹
thymine. Trimethoprim (RPI Research Products International) was added to plates
at a final concentration of 5 µg ml⁻¹ for selecting loss-of-function mutations in
the thyP3 gene. To induce expression of thyP3, IPTG was added to the medium at a
final concentration of 1 mM. No statistical methods were used to predetermine 
sample size. The experiments were not randomized. The investigators were not 
blinded to allocation during experiments and outcome assessment. 

Strain construction. Strains used are derivatives of the wild-type strain B. subtilis 
168 (JDW437) unless otherwise stated and are listed in Extended Data Table 1. The 
plasmids and PCR primers are listed in Extended Data Tables 1 and 2, respectively. 
The thyP3 strains were created in the ΔthyA (JDW1543) background. thyA
was deleted using the markerless deletion method* with plasmid pJW395. The 
head-on thyP3 strain JDW1544 was generated by transforming JDW1543 with
linearized plasmid pJW396. The co-directional thyP3 strain JDW1563 was generated
by transforming JDW1543 with linearized plasmid pJW397. Swapped head-on and
co-directional thyP3 strains JDW1900 and JDW1901 were created by transforming
JDW1543 with linearized plasmids pJW430 and pJW431, respectively. The head-on
thyP3 strain (JDW1176) in a ΔthyA ΔthyB background was created by transforming
JDW942 with linearized plasmid pJW331. The lacZ reporter strains used in
competition assays were created by transforming the respective thyP3 wild-type 
or mutant strains with linearized plasmid pJW417. 

Plasmid pJW395 was constructed to create a markerless deletion of thyA, 
by inserting thyA upstream homologous sequence (PCR amplified by primers 
oJW1052/oJW1053) and downstream homologous sequence (PCR amplified by
primers oJW1054/oJW1055) between the EcoRI and BamHI sites of pJW299.
Plasmid pJW331 was constructed by inserting the thyP3 gene between the SalI and
SphI sites of pDR90. The thyP3 gene sequence, including its promoter, was amplified
from genomic DNA of JDW941 using oJW760/oJW761. Plasmid pJW396 was
constructed by inserting the thyP3 gene between the SalI and SphI sites of pDR110.
Plasmid pJW397 was constructed by excising the Pspank-thyP3 region from
pJW396 by double-restriction digest with EcoRI and SphI and replacing it with
the Pspank-thyP3 sequence in the inverse orientation between the EcoRI and SphI
sites. The Pspank-thyP3 sequence for inversion was amplified from pJW396 using 
primers oJW785 and oJW1137. 

Plasmid pJW430 was created by Gibson assembly*’ of a DNA fragment containing
the lacI-Pspank-thyP3-spec sequences and the portion of the pDR110 plasmid
backbone containing the plasmid replication origin, ampᴿ, and amyE front (5’)
and back (3’) homology sequences. The DNA fragment with lacI-Pspank-thyP3-spec
sequences was amplified from pJW397 using oJW1336/oJW1339. The pDR110
backbone fragment was amplified from pDR110 using oJW1337/oJW1338.
Plasmid pJW431 was created in the same way as plasmid pJW430, except the
DNA fragment with lacI-Pspank-thyP3-spec sequences was amplified from pJW396,
instead of pJW397, using the same primers. Plasmid pJW417 was created by
Gibson assembly** of a DNA fragment containing the spoVG-lacZ sequences and
a portion of pDR110 plasmid containing the Ppen promoter and lacA locus 5’ and
3’ homology sequences for integration. The DNA fragment containing the spoVG-lacZ
sequence and plasmid backbone were amplified from pEX44 using oJW1200/
oJW1201 and oJW1213/oJW1214, respectively. The Ppen promoter was amplified
from pDR110 using oJW1202/oJW1203. The lacA 5’ and 3’ homology regions were
amplified from the chromosomal DNA of B. subtilis 168 using oJW1215/oJW1199
and oJW1204/oJW1216, respectively.

Deletion mutants of yqjH gene encoding PolIV (JDW2266), adeC encoding
adenine deaminase (JDW2501), recA encoding the recombinase RecA (JDW2288)
and yxlJ encoding hypoxanthine-DNA glycosylase (JDW2284) were obtained from
the Bacillus genetic stock centre (BGSC). Co-directional (JDW1563) and head-on
thyP3 (JDW1544) strains were transformed with the genomic DNA of each mutant
and were selected on erythromycin plates at 37 °C. Deletion of each gene was
confirmed by PCR (yqjH-oJW1900/1901; adeC-oJW1904/1905; recA-oJW2008/2009;
yxlJ-oJW1906/1907) and the recA mutant was also tested for ultraviolet sensitivity.
Deletion of mismatch repair genes mutS and mutL was created by transforming 
the genomic DNA of JDW1297 into co-directional (JDW1563) and head-on thyP3 
(JDW1544) strains and were selected on kanamycin plates at 37°C. The kana- 
mycin gene insertion inactivated both mismatch repair genes and insertion was 
confirmed by PCR (oJW1902/1903). 

Forward mutation fluctuation tests. Fluctuation tests were performed to measure
the forward mutation rate. All the thyP3 strains were in the background ΔthyA
thyB⁺. For each biological repeat, at least 30 parallel cultures of 0.1 ml in 96-well
plates were set up for each strain at a dilution of 1 × 10⁻⁵ and grown at 37 °C
to OD600 nm = 0.4-0.6 in S7 minimal medium with 20 µg ml⁻¹ thymine and with
1 mM IPTG (induced transcription) or without IPTG (uninduced transcription).
Loss-of-function mutations in thyP3 genes confer resistance to trimethoprim
(TMP). For selection of mutants, 0.1 ml of culture was plated on Spizizen’s minimal
medium containing 20 µg ml⁻¹ thymine, 1 mM IPTG and 5 µg ml⁻¹ trimethoprim.
Plates were incubated at 45 °C, and the number of trimethoprim-resistant colonies
was counted at 48 h (day 2) and 72 h (day 3) of incubation. Serial dilutions of
at least three cultures were plated on non-selective medium to determine the
average c.f.u. The number of mutations per culture (m) was estimated using the
Ma-Sandri-Sarkar maximum likelihood estimator (MSS-MLE) method through
the Fluctuation AnaLysis CalculatOR (FALCOR) web tool™, and the mutation rate
per cell per generation was calculated by m/(2 × Nₜ), where Nₜ is the average number
of cells across cultures in a fluctuation test²¹. Fluctuation tests of the deletion
mutants were performed as described for the wild-type strains earlier except for
recA and mutSL deletion strains. Since the recA mutant showed increased sensitivity
to trimethoprim, selection of thyP3 mutants was done at 1 µg ml⁻¹ concentration
of trimethoprim and mutant colonies were obtained from day 4 and day 5 after
incubation. Fluctuation tests with mutSL mutants were performed identically to
wild type, except that the cultures were diluted 1:20 for selection on trimethoprim
plates, as inactivation of mismatch repair increases the mutation rate. The mean
of mutation rates from n ≥ 3 independent experiments was plotted with error
bars representing standard error of the mean (s.e.m.). Statistical significance was
calculated by paired Student's t-test of ln(m) values²¹.
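
The estimation step can be sketched in code. The block below is a minimal illustration and not the FALCOR implementation: it assumes the standard Ma-Sandri-Sarkar recursion for the Luria-Delbrück distribution, replaces the usual iterative maximisation with a plain grid search, and uses invented colony counts and an invented Nₜ. Only the final formula m/(2 × Nₜ) is taken from the text.

```python
import math


def luria_delbruck_pmf(m, r_max):
    """Return [p_0, ..., p_r_max] for m expected mutations per culture.

    Recursion of Ma, Sandri and Sarkar:
        p_0 = exp(-m)
        p_r = (m / r) * sum_{i=0}^{r-1} p_i / (r - i + 1)
    """
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        p.append((m / r) * sum(p[i] / (r - i + 1) for i in range(r)))
    return p


def log_likelihood(m, mutant_counts):
    """Log-likelihood of the observed mutant counts for a given m."""
    p = luria_delbruck_pmf(m, max(mutant_counts))
    return sum(math.log(p[r]) for r in mutant_counts)


def estimate_m(mutant_counts, m_max=20.0, step=0.01):
    """Maximum-likelihood m on a grid (a simple stand-in for iterative solvers)."""
    n_steps = int(round(m_max / step))
    grid = [i * step for i in range(1, n_steps + 1)]
    return max(grid, key=lambda m: log_likelihood(m, mutant_counts))


def mutation_rate(m, n_t):
    """Mutation rate per cell per generation, m / (2 * N_t), as in the text."""
    return m / (2.0 * n_t)


# Hypothetical resistant-colony counts from ten parallel cultures.
counts = [0, 1, 0, 2, 5, 0, 1, 3, 0, 1]
m_hat = estimate_m(counts)
rate = mutation_rate(m_hat, n_t=2.0e8)  # N_t is an invented value
print(m_hat, rate)
```

A grid search is slow for cultures with many mutants, because the recursion costs on the order of r² operations per evaluation, but it keeps the sketch free of any optimiser dependency.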

We employed a mutation assay for nalidixic acid resistance, which is conferred
by mutations in the gyrA gene encoding DNA gyrase, to examine whether
mutation rate is different outside the thyP3 locus between the co-directional and
head-on thyP3 strains. For measurement of the mutation rate for nalidixic acid
resistance (Nalᴿ), at least 30 parallel 1 ml cultures were grown in test tubes to
OD600 nm = 0.4-0.6 and entire cultures were plated on minimal medium containing
20 µg ml⁻¹ thymine, 1 mM IPTG and 50 µg ml⁻¹ nalidixic acid (Sigma-Aldrich).
Plates were incubated at 45 °C for 48 h, and the number of plates with no Nalᴿ
colonies was counted. Serial dilutions were plated on non-selective medium to count
the number of c.f.u. The number of Nalᴿ mutations per culture (m) was estimated
using the P₀ method and the mutation rate was calculated by m/(2 × Nₜ)²¹. Error
bars represent the standard error from at least three independent experiments.
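
The P₀ calculation is short enough to write out. The sketch below assumes the usual form of the P₀ method, in which the fraction of cultures with no mutants equals exp(−m), so that m = −ln(P₀); the culture numbers and Nₜ are invented for illustration.

```python
import math


def p0_mutations_per_culture(n_cultures, n_without_mutants):
    """Estimate m from the fraction of cultures with no resistant colonies."""
    if not 0 < n_without_mutants <= n_cultures:
        raise ValueError("need at least one culture without mutants")
    p0 = n_without_mutants / n_cultures
    return -math.log(p0)


def mutation_rate(m, n_t):
    """Mutation rate per cell per generation, m / (2 * N_t), as in the text."""
    return m / (2.0 * n_t)


# Hypothetical experiment: 30 cultures, 15 of which gave no resistant colonies.
m = p0_mutations_per_culture(30, 15)
rate = mutation_rate(m, n_t=5.0e8)  # N_t is an invented value
print(m, rate)
```

The estimate is undefined when every culture contains mutants, which is why the function rejects a zero count.
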

Mutation spectra and rates of different mutations. To obtain the mutation
spectrum, genomic DNA from one colony per selective plate was extracted by using the
prepGEM Bacteria kit (Zygem) and thyP3 was PCR amplified and sequenced using 
primers oJW1013 and oJW1335. The rate of individual mutation was determined 
by multiplying the total mutation rate by the proportion of different mutations in 
the mutation spectra as described*. Statistical significance of differences between 
co-directional and head-on strains for different mutation types was obtained using 
Student's t-test. 
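
Partitioning the total rate among mutation classes amounts to scaling by the observed proportions. The following sketch shows that arithmetic under stated assumptions: the class labels, the counts of sequenced colonies and the total rate are all invented.

```python
from collections import Counter


def rates_by_class(total_rate, sequenced_classes):
    """Split a total mutation rate by each class's share of sequenced mutants."""
    counts = Counter(sequenced_classes)
    n = sum(counts.values())
    return {cls: total_rate * k / n for cls, k in counts.items()}


# Hypothetical spectrum: one sequenced colony per selective plate.
spectrum = (["indel"] * 5
            + ["promoter substitution"] * 3
            + ["coding substitution"] * 2)
class_rates = rates_by_class(4.0e-8, spectrum)
print(class_rates)
```

Because the proportions sum to one, the class rates add back up to the total rate.
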

qRT-PCR. Measurement of thyP3 transcription levels was performed by qRT-PCR.
Cultures were grown in minimal media with 20 µg ml⁻¹ thymine, with or
without 1 mM IPTG, to OD600 nm = 0.4-0.6. RNA was isolated using the Qiagen
RNeasy kit and reverse-transcribed using SuperScript III reverse transcriptase
(Life Technologies). Real-time PCR was performed using SYBR green master mix
(Applied Biosystems) with primers oJW1217/oJW1218 for amplifying the beginning
of thyP3. The accA gene transcript amplified with primers oJW1221/oJW1222
was used as an internal control°®. 

Competition assay. Competition experiments were performed between strains 
carrying the wild-type and mutant thyP3. Strains were grown in S7 minimal 
medium supplemented with 1% glucose, 0.1% glutamate, 40 µg ml⁻¹ tryptophan
and 1 mM IPTG. Strains in competition were distinguished by integrating a lacZ
reporter gene at the lacA locus in the chromosome, enabling the competitors to
be distinguished on 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal)
indicator plates in which LacZ⁻ and LacZ⁺ form white and blue colonies,
respectively. The lacZ marker was swapped between the competing strains to negate any
growth effect from the lacZ marker. Strains were preconditioned in the growth 
medium to saturation. Cultures were then mixed in a 1:1 ratio, and serial passage 
was performed with 1:1,000 dilutions (~10 generations per cycle) every 12h until 
70 generations. The ratio of mutant over wild type at each cycle was estimated by 
plating the serially diluted cultures on SPII minimal plates supplemented with 
40 µg ml⁻¹ tryptophan, 20 µg ml⁻¹ thymine and 40 µg ml⁻¹ X-gal at 37 °C. Growth
rate was calculated using the initial and final cell densities for each strain in the 
pair, and the relative fitness was calculated as the ratio of growth rates of mutant 
over wild-type cells’. Assays were performed with three independent replicates of 
the mutant tagged with lacZ and another three in which the wild type was tagged 
with lacZ. Relative fitness was then expressed as the mean ± s.d. of replicates with
and without the marker. To rule out reversion of the mutant thyP3 strain during 
competition growth, strains were plated at the end of 70 generations on X-gal 
indicator plates with and without thymine at 45 °C and also on trimethoprim plates;
only the wild type formed colonies on plates without thymine and the mutants
formed colonies only on plates with thymine. As expected, wild type did not form 
colonies on trimethoprim, while the mutants formed colonies. 
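
The relative-fitness calculation described above can be sketched as follows. This is a minimal illustration only: it assumes that the growth rate of each strain is taken as the natural log of the ratio of final to initial density (with densities already corrected for the serial dilutions), which the text does not state explicitly, and the function names are hypothetical.

```python
import math


def generations_per_cycle(dilution_factor: float) -> float:
    """Doublings needed to regrow to the starting density after a dilution."""
    return math.log2(dilution_factor)


def growth_rate(initial_density: float, final_density: float) -> float:
    """Realized growth over the competition, ln(N_final / N_initial).

    Assumption: densities are dilution-corrected c.f.u. per ml.
    """
    if initial_density <= 0 or final_density <= 0:
        raise ValueError("densities must be positive")
    return math.log(final_density / initial_density)


def relative_fitness(mut_initial: float, mut_final: float,
                     wt_initial: float, wt_final: float) -> float:
    """Ratio of the mutant growth rate over the wild-type growth rate."""
    return growth_rate(mut_initial, mut_final) / growth_rate(wt_initial, wt_final)


if __name__ == "__main__":
    # A 1:1,000 dilution corresponds to roughly 10 generations per cycle.
    print(round(generations_per_cycle(1000), 2))
    # Hypothetical densities in which both strains grow equally well.
    print(relative_fitness(1e5, 1e8, 1e5, 1e8))
```

A value of 1 indicates no fitness difference between the competitors; values below 1 indicate a disadvantage for the mutant.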

Restriction digest screen of promoter mutation. To screen for the T₋₇→C₋₇ 
mutation in the promoter, the first half of the thyP3 fragment including the 
promoter region was PCR amplified with primers oJW1335 and oJW1011 from mutant 
DNA. The PCR fragment was digested with AflIII enzyme (NEB) and digested 
products were analysed in 1.5% agarose gel. PCR fragments containing the 
T₋₇→C₋₇ mutation are digested by AflIII, whereas wild-type fragments are not 
digested (Extended Data Fig. 8e). 
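
The logic of this screen can be illustrated in a few lines of Python. The sketch assumes the AflIII recognition sequence ACRYGT (R = A or G, Y = C or T) and uses the wild-type and mutant hexamers as they appear to be drawn in Extended Data Fig. 8e (ATGTGT and ACGTGT); both hexamers and the function name are illustrative assumptions, not values taken from the Methods text.

```python
import re

# AflIII recognition sequence ACRYGT written as a regular expression.
# The site is palindromic, so scanning one strand is sufficient.
AFLIII_SITE = re.compile(r"AC[AG][CT]GT")


def has_afliii_site(sequence: str) -> bool:
    """Return True if the sequence contains at least one AflIII site."""
    return AFLIII_SITE.search(sequence.upper()) is not None


if __name__ == "__main__":
    wild_type = "ATGTGT"  # assumed hexamer around position -7 (wild type)
    mutant = "ACGTGT"     # same hexamer after the T-to-C substitution
    print(has_afliii_site(wild_type))  # the wild-type hexamer is not cut
    print(has_afliii_site(mutant))     # the mutant hexamer is cut
```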

Sequence logo of the −10 element of SigA-dependent promoters. We obtained 
the sequences of the −10 element of all experimentally validated SigA-dependent 
promoters (n = 358) available at the DBTBS database³⁷ and used the WebLogo 
tool³⁸ to generate the consensus motif with the default parameters to show the 
genome-wide conservation of the −10 element. 

Comparative genomic and molecular evolutionary analyses. For comparative 
genomic and evolutionary analyses, we used the completed genomes of eight 
strains of B. subtilis and one B. amyloliquefaciens strain, a close relative of 
B. subtilis. The analysed genomes are listed in Extended Data Table 2. Complete 
genomes, amino acid and nucleotide sequences of genes and intergenic sequences, 
and gene annotation information were downloaded from the Integrated Microbial 
Genomes (IMG) database³⁹. Core genes from B. subtilis and B. amyloliquefaciens 
were identified by the standard all-against-all reciprocal best-hit method using 
BLASTP. Best bidirectional hits were considered when the alignment had >85% 
identity with 85% coverage length at an e-value cut-off of 10 7°. We eliminated 
from the analysis any gene annotated as a pseudogene or containing ambiguous 
nucleotides. 
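
The reciprocal best-hit step can be sketched as below. This is an illustrative outline rather than the pipeline used in the study: the `Hit` record, the use of bit score to rank hits, and the function names are assumptions, and the e-value cut-off is left as a required argument because its exponent is not legible in this text.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    """One pairwise BLASTP hit (hypothetical record layout)."""
    query: str
    subject: str
    identity: float   # percent identity
    coverage: float   # percent of the query length covered
    evalue: float
    bitscore: float


def best_hits(hits: Iterable[Hit], max_evalue: float,
              min_identity: float = 85.0,
              min_coverage: float = 85.0) -> Dict[str, str]:
    """Keep, for each query, the highest-scoring hit that passes the filters."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.identity <= min_identity or hit.coverage < min_coverage:
            continue
        if hit.evalue > max_evalue:
            continue
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                         max_evalue: float) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}
```

Core genes would then be those with a reciprocal best hit in every genome compared.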

To assign genes to the leading and lagging strands, we obtained the sequence 
coordinates of the oriC and dif sites from the DoriC database⁴⁰ for each genome. 
Using these coordinates in combination with transcript orientation information 
from the genome annotation files (plus or minus strand), genes were assigned to 
the leading or lagging strand. All genes analysed were present on the same 
strand (either leading or lagging) in all the genomes analysed. 
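
A minimal sketch of this assignment on a circular chromosome is given below. It assumes that replication proceeds bidirectionally from oriC to the dif site, that coordinates increase clockwise, and that a gene is classified by its midpoint; the function names and the example coordinates in the tests are hypothetical.

```python
def fork_direction(position: int, ori: int, ter: int, genome_length: int) -> int:
    """Return +1 if the replication fork passing `position` moves towards
    increasing coordinates (clockwise replichore), otherwise -1."""
    distance_to_position = (position - ori) % genome_length
    distance_to_terminus = (ter - ori) % genome_length
    return 1 if distance_to_position < distance_to_terminus else -1


def strand_class(gene_start: int, gene_end: int, gene_strand: str,
                 ori: int, ter: int, genome_length: int) -> str:
    """Classify a gene as 'leading' (transcribed co-directionally with
    replication) or 'lagging' (transcribed head-on to replication)."""
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    midpoint = (gene_start + gene_end) // 2
    fork = fork_direction(midpoint, ori, ter, genome_length)
    transcription = 1 if gene_strand == "+" else -1
    return "leading" if fork == transcription else "lagging"
```

The modulo arithmetic lets the same function handle genomes whose annotated origin is not at coordinate zero.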

To extract promoter sequences, experimentally validated promoter annotations 
were obtained for the core genes of B. subtilis strain 168 from the DBTBS 
database³⁷. The sequence encompassing the transcription start site (+1) and the 
−10 and −35 elements of the promoter was obtained. Using these promoter 
sequences as references, homologous promoters from the other genomes of 
B. subtilis and B. amyloliquefaciens were obtained using the blastn-short 
algorithm of BLASTN, requiring 75% identity over 80% alignment coverage with 
e-value less than 10°. We obtained 179 promoters (147 and 32 for leading- and 
lagging-strand genes, respectively). 

The amino acid sequences and the corresponding nucleotide sequences of 
protein-coding core genes were aligned using the G-INS-i algorithm of the MAFFT 
alignment program (v.7.012b)⁴¹. Furthermore, to produce high-quality alignments, 
we used the PAL2NAL program (v.12.1)⁴², which produces codon-based alignments 
from aligned protein sequences and the corresponding DNA sequences. 
Additionally, PAL2NAL reports whether the protein and nucleotide sequences 
have mismatches or in-frame stop codons. The codon-based alignments of the 
core genes generated by PAL2NAL did not contain any mismatches or in-frame 
stop codons, which ensured the high quality of the alignments. 
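
The idea behind a codon-based alignment can be shown with a short sketch: each residue of the aligned protein is replaced by its codon and each gap by three gap characters. This is not the PAL2NAL implementation; the function name is hypothetical, and the sketch only checks sequence lengths rather than verifying that each codon translates to the aligned residue.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an ungapped coding sequence onto a gapped protein alignment row.

    Assumption: `cds` holds exactly three nucleotides per residue and no
    trailing stop codon.
    """
    residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) != 3 * residues:
        raise ValueError("coding sequence length does not match the protein")
    codons = []
    index = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")  # a protein gap spans one whole codon
        else:
            codons.append(cds[index:index + 3].upper())
            index += 3
    return "".join(codons)


if __name__ == "__main__":
    print(thread_codons("M-K", "atgaaa"))
```

Because gaps are always inserted in multiples of three, the reading frame is preserved across the alignment.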

For aligning the promoters, we used the E-INS-i algorithm of the MAFFT 
alignment program, which is optimized for aligning highly conserved motifs 
interspaced between weakly conserved regions. The alignments of the 
experimentally validated promoters were manually inspected for any 
misalignments. 

Estimation of nucleotide substitutions in promoters. To estimate nucleotide 
substitutions in promoters, we first constructed a phylogeny using the 
concatenated sequence of the core genome genes, that is, genes present in all 
the analysed genomes. The aligned nucleotide sequences were concatenated to 
create a single sequence for each analysed strain. The phylogeny was constructed 
using the PhyML program⁴³ with 500 bootstrap replicates. The substitution model 
used was the general time reversible (GTR) model with a discrete gamma model, 
and the gamma parameter was estimated. 

For each promoter, substitutions were estimated by pairwise comparison of the 
different strains using the baseml program of the PAML package⁴⁴. The baseml 
program uses a maximum likelihood approach to estimate nucleotide substitutions, 
based on an input phylogenetic tree. We used the maximum likelihood phylogenetic 
tree generated earlier and the substitution model was GTR. The rest of the 
parameters were default. Then for each promoter, mean substitutions per site 
were calculated and the distribution of mean pairwise substitution rates was 
compared between leading- and lagging-strand promoters. The Mann-Whitney U-test 
was used to determine statistical significance. 

For comparing the mutation rates between promoters with and without 
transcription factor binding, we used the population genetic parameter 
Watterson’s estimator of theta (θw). Since θw is a population genetic parameter, 
it is well suited for analysing within-species sequence polymorphism and thus 
θw serves as a proxy for the mutation rate of a given promoter. We calculated 
θw for the total number of mutations in the high-quality sequence alignment for 
each promoter across the eight strains of B. subtilis (Extended Data Table 2) 
using the DnaSP software (v.5)⁴⁵. Promoters with experimentally validated 
transcription factor binding were obtained from the DBTBS database³⁷. Sequences 
covering the +1 site and the −10 and −35 elements, which include the 
transcription-factor-binding site, were used for constructing the alignment as 
described earlier. A total of 33 different transcription factors that are 
experimentally validated in B. subtilis were used (Extended Data 
Table 2). The Mann-Whitney U-test was used to determine statistical 
significance. 
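
For orientation, Watterson’s estimator divides the number of mutations observed in an alignment of n sequences by the harmonic term a_n = 1 + 1/2 + ... + 1/(n − 1). The sketch below is a minimal stand-in for the DnaSP calculation, not a reproduction of it: it assumes that columns containing gaps or ambiguous bases are skipped and that the number of mutations at a site is the number of distinct bases minus one; the function names are hypothetical.

```python
from typing import List


def harmonic_term(n: int) -> float:
    """a_n = sum of 1/i for i = 1 .. n-1, for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(alignment: List[str], per_site: bool = True) -> float:
    """Watterson's theta from the total number of mutations in an alignment."""
    n = len(alignment)
    if n < 2:
        raise ValueError("need at least two aligned sequences")
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must have the same aligned length")
    mutations = 0
    sites = 0
    for column in zip(*(seq.upper() for seq in alignment)):
        if any(base not in "ACGT" for base in column):
            continue  # assumption: skip gapped or ambiguous columns
        sites += 1
        mutations += len(set(column)) - 1
    theta = mutations / harmonic_term(n)
    if not per_site:
        return theta
    if sites == 0:
        raise ValueError("no usable alignment columns")
    return theta / sites
```

With eight strains, as here, a_n is 1 + 1/2 + ... + 1/7, or about 2.59.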

Nitrous acid mutagenesis. Nitrous acid is known to strongly deaminate purines 
and pyrimidines in DNA. Adenine is deaminated to hypoxanthine⁴⁶, which produces 
an A:T to G:C transition. We subjected the wild-type and yxlJ (encoding 
hypoxanthine-DNA glycosylase)⁴⁷ mutant strains carrying the head-on thyP3 
reporter under induced transcription to nitrous acid treatment following the 
protocol reported previously⁴⁸. Briefly, cells were grown in 5 ml of S7 minimal 
medium with 20 µg ml⁻¹ thymine, 40 µg ml⁻¹ tryptophan and 1 mM IPTG for 12 h to 
saturation. To the saturated cultures, 1 ml of 8.7 M NaNO₂ (nitrous acid 
dissolved in sodium acetate buffer pH 4.6) (Sigma-Aldrich) was added and 
incubated at room temperature for 60 min. As a control, cells were treated with 
the sodium acetate buffer in parallel. Cells were then spun down, washed and 
re-suspended in the growth medium; 1 ml of culture was used for determining the 
c.f.u. and the rest of the culture was plated on minimal plates supplemented 
with 20 µg ml⁻¹ thymine, 40 µg ml⁻¹ tryptophan, 1 mM IPTG and 5 µg ml⁻¹ 
trimethoprim for selecting trimethoprim-resistant mutants. The same was 
performed for buffer-treated cells except that 0.1 ml of culture was used to 
determine c.f.u. and 0.1 ml was plated for selecting trimethoprim-resistant 
colonies. After 2 days of incubation, trimethoprim-resistant colonies appeared, 
and as described before the thyP3 gene was PCR amplified and screened for the 
T₋₇→C₋₇ mutation. Mutation frequency was calculated by dividing the number of 
trimethoprim-resistant colonies by the number of colonies on the nonselective 
plate. The experiment was done in triplicate and error bars represent 
s.e.m. Statistical significance was obtained using Student’s t-test. 
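
The mutation-frequency calculation can be sketched as follows. The normalization of both counts to c.f.u. per ml (using plated volume and a dilution factor) is an assumption added here so that counts from different plated volumes are comparable; the function names and the example numbers are hypothetical.

```python
import math
import statistics
from typing import Sequence


def cfu_per_ml(colonies: int, volume_plated_ml: float,
               dilution_factor: float = 1.0) -> float:
    """Colony-forming units per ml of the undiluted culture."""
    return colonies * dilution_factor / volume_plated_ml


def mutation_frequency(resistant_colonies: int, resistant_volume_ml: float,
                       viable_colonies: int, viable_volume_ml: float,
                       viable_dilution_factor: float) -> float:
    """Resistant c.f.u. per ml divided by total viable c.f.u. per ml."""
    resistant = cfu_per_ml(resistant_colonies, resistant_volume_ml)
    viable = cfu_per_ml(viable_colonies, viable_volume_ml, viable_dilution_factor)
    return resistant / viable


def sem(values: Sequence[float]) -> float:
    """Standard error of the mean of replicate measurements."""
    return statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical counts: 50 resistant colonies from 1 ml plated undiluted,
    # 200 colonies from 0.1 ml of a 100,000-fold dilution.
    print(mutation_frequency(50, 1.0, 200, 0.1, 1e5))
```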


31. Vasantha, N. & Freese, E. Enzyme changes during Bacillus subtilis sporulation 
caused by deprivation of guanine nucleotides. J. Bacteriol. 144, 1119-1125 
(1980). 

32. Janes, B. K. & Stibitz, S. Routine markerless gene replacement in Bacillus 
anthracis. Infect. Immun. 74, 1949-1953 (2006). 

33. Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several 
hundred kilobases. Nature Methods 6, 343-345 (2009). 

34. Hall, B. M., Ma, C.-X., Liang, P. & Singh, K. K. Fluctuation analysis CalculatOR: 
a web tool for the determination of mutation rate using Luria-Delbrück 
fluctuation analysis. Bioinformatics 25, 1564-1565 (2009). 

35. Lippert, M. J. et al. Role for topoisomerase 1 in transcription-associated 
mutagenesis in yeast. Proc. Natl Acad. Sci. USA 108, 698-703 (2011). 

36. Ter Beek, A. et al. Transcriptome analysis of sorbic acid-stressed Bacillus subtilis 
reveals a nutrient limitation response and indicates plasma membrane 
remodeling. J. Bacteriol. 190, 1751-1761 (2008). 

37. Sierro, N., Makita, Y., de Hoon, M. & Nakai, K. DBTBS: a database of 
transcriptional regulation in Bacillus subtilis containing upstream intergenic 
conservation information. Nucleic Acids Res. 36, D93-D96 (2008). 

38. Crooks, G. E., Hon, G., Chandonia, J.-M. & Brenner, S. E. WebLogo: a sequence 
logo generator. Genome Res. 14, 1188-1190 (2004). 

39. Markowitz, V. M. et al. IMG: the Integrated Microbial Genomes database and 
comparative analysis system. Nucleic Acids Res. 40, D115-D122 (2012). 

40. Gao, F. & Zhang, C.-T. DoriC: a database of oriC regions in bacterial genomes. 
Bioinformatics 23, 1866-1867 (2007). 

41. Katoh, K. & Toh, H. Parallelization of the MAFFT multiple sequence alignment 
program. Bioinformatics 26, 1899-1900 (2010). 

42. Suyama, M., Torrents, D. & Bork, P. PAL2NAL: robust conversion of protein 
sequence alignments into the corresponding codon alignments. Nucleic Acids 
Res. 34, W609-W612 (2006). 

43. Guindon, S. et al. New algorithms and methods to estimate maximum- 
likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 
307-321 (2010). 

44. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 
24, 1586-1591 (2007). 

45. Librado, P. & Rozas, J. DnaSP v5: a software for comprehensive analysis of 
DNA polymorphism data. Bioinformatics 25, 1451-1452 (2009). 

46. Lindahl, T. Instability and decay of the primary structure of DNA. Nature 362, 
709-715 (1993). 

47. Aamodt, R. M., Falnes, P. Ø., Johansen, R. F., Seeberg, E. & Bjørås, M. 
The Bacillus subtilis counterpart of the mammalian 3-methyladenine DNA 
glycosylase has hypoxanthine and 1,N⁶-ethenoadenine as preferred 
substrates. J. Biol. Chem. 279, 13601-13606 (2004). 

48. Zamenhof, S. Gene unstabilization induced by heat and by nitrous acid. 
J. Bacteriol. 81, 111-117 (1961). 




[Extended Data Figure 1 image. a, dTMP synthesis pathway (thymidylate synthetase, thymine, trimethoprim) and viability table by genotype on − thymine, + thymine and + thymine + trimethoprim plates. b, Relative fitness versus generations (missense mutant vs WT; nonsense mutant vs WT). c, Plating at 37 °C and 45 °C for thyP3⁺ and thyP3⁻ cells in co-directional and head-on orientations. d, Doubling time of WT, deletion mutant and frameshift mutant.] 


Extended Data Figure 1 | Development of a forward mutation assay that 
detects loss-of-function mutations in B. subtilis. a, Simplified diagram 
of thymidine monophosphate (dTMP) synthesis. The phage-encoded 
thyP3 encodes thymidylate synthetase, which synthesizes dTMP and 
dihydrofolate (DHF) from dUMP and tetrahydrofolate (THF). DHF is 
recycled back to THF by dihydrofolate reductase (DHFR). Trimethoprim 
inhibits DHFR, thus blocking recycling of the essential cofactor THF; the 
available THF is depleted by active thymidylate synthetase and cell growth 
is inhibited. Because cells with active thymidylate synthetase rely solely on 
endogenous dTMP synthesis, thyP3⁺ cells are sensitive to trimethoprim 
and loss-of-function mutations in thyP3 lead to trimethoprim resistance, 
which is the basis for the forward mutation assay. Viabilities of wild-type 
(thyA⁺ thyB‘), thyP3⁺ (ΔthyA thyB‘ thyP3⁺) and thyP3⁻ (ΔthyA thyB‘s 
thyP3⁻) cells are shown in the table and representative colonies (at 45 °C) 
are shown on the right. b, Competition between strains carrying wild-type 
(WT) and mutant thyP3 in a ΔthyA thyB‘ background to determine if 
there is any selective pressure on different mutants during growth phase 
at permissive temperature (37 °C). Relative fitness (mean ± s.d.) of six 
replicates is shown. c, Shifting the temperature to 45 °C does not affect 
plating efficiency during selection for trimethoprim resistance. Wild-type 
and mutant thyP3 cells were grown at 37 °C and plated on solid medium 
supplemented with IPTG plus thymine at 37 °C and 45 °C, and colony- 
forming units (c.f.u.) ml⁻¹ optical density (OD600 nm)⁻¹ was determined. 
Mean ± s.d. of three replicates is shown. d, thyP3 mutants have growth 
defects without thyB'. The doubling times of thyP3 mutants (a deletion 
and a frameshift mutant) in the ΔthyA ΔthyB background at 37 °C are 
longer, indicative of growth defects in the absence of the backup gene thyB. 
Mean ± s.d. of three replicates is shown. For b and d, the mutant strains 
are listed in Extended Data Table 1. 




[Extended Data Figure 2 image. a, thyP3 expression in co-directional and head-on strains. b, NalR mutation rate in co-directional and head-on strains. c, Schematics of the co-directional, head-on, co-directional-swapped and head-on-swapped constructs, with the directions of replication and transcription, promoter and terminator marked. d, Mutation rate of the head-on-swapped and co-directional-swapped strains. e, Mutation rate of co-directional and head-on thyP3 under −IPTG and +IPTG, subdivided into promoter base substitutions, intragenic base substitutions and indels.] 


Extended Data Figure 2 | Expression level and mutation rate of thyP3. 
a, thyP3 expression in co-directional and head-on orientations. Using real- 
time quantitative PCR, the messenger RNA level of thyP3 in the co-directional 
and head-on strains under induced (+IPTG) conditions was measured 
and normalized to the reference gene accA. Since the level of expression is 
similar between the strains, the observed difference in thyP3 mutation rate 
between co-directional and head-on orientations (Fig. 1d) is not caused by 
intrinsic differences in the expression level of thyP3. b, The orientation- 
specific difference in thyP3 mutation rate is not due to a global increase of 
mutagenesis in the head-on strain. As a control to show that the increase 
in mutation rate is local to the thyP3 reporter, we measured the mutation 
rate for resistance to nalidixic acid (Nalᴿ, conferred by mutations in the 
gyrA gene) in co-directional and head-on strains. Since the Nalᴿ mutation 
rates in the two strains were similar, we conclude that the observed 
increase in head-on mutation rate is specific to the thyP3 gene. c, Schematics 
of the co-directional and head-on thyP3 constructs (left). As an additional 
control to examine the effect of the genomic context on thyP3 mutagenesis, 
the neighbouring genes were swapped (right). In each construct, the thyP3 
gene is flanked by the lacI gene and the spectinomycin-resistance gene. 
The reporter constructs were integrated into the chromosome at the amyE 
locus by double crossover. The direction of replication is shown at the top. 
The co-directional-swapped strain was created by inverting the 
lacI-thyP3-spc from the head-on strain and the head-on-swapped 
construct was created by inverting the same construct from the 
co-directional strain. Thus the swapped constructs switch the 
neighbouring transcription units. The dotted lines in each construct 
show the swapping boundary. d, The mutation rate of the swapped 
head-on strain is still higher than that of the swapped co-directional strain 
when transcription is induced (+IPTG), indicating that the difference 
in mutation rate between reporter strains is not due to the direction of 
thyP3 relative to its neighbouring genes. e, Mutation rate of co-directional 
and head-on thyP3 under uninduced (−IPTG) and induced (+IPTG) 
transcription. The rate of each class of mutations obtained under each 
condition is also depicted within each bar. For b, d and e, mean ± s.e.m. of 
n > 3 independent experiments is shown. **P < 0.01, Student's t-test. 




[Extended Data Figure 3 image. a, b, Annotated thyP3 promoter and coding sequence (coordinates −100 to +801 and beyond) with the mutations recovered in co-directional (a) and head-on (b) strains marked above the sequence. Legend: ≥2 nt insertion, 1 nt insertion, deletion/inversion, ≥2 nt deletion, 1 nt deletion, −35, −10, SD, Start, Stop.] 

Extended Data Figure 3 | Mutation spectra of thyP3 under induced 
transcription. a, b, Illustrations of the mutation spectra of the thyP3 
mutants obtained from fluctuation tests of co-directional (n = 214) (a) and 
head-on (n = 232) (b) strains when transcription is induced (+IPTG). The 
thyP3 coding sequence with its promoter is shown. Sequence coordinates 
are indicated with reference to +1 transcription start site. The symbols 
used to represent different mutations are shown at the bottom, and 
base substitutions are shown in blue above the sequence. The numbers 
marked in orange next to a mutation denote the frequency. The promoter 
elements, Shine-Dalgarno (SD) sequence, and start and stop codons are 
highlighted in each spectrum. 

[Extended Data Figure 4 image. a, b, Annotated thyP3 promoter and coding sequence (coordinates −100 to +801 and beyond) with the mutations recovered in co-directional (a) and head-on (b) strains marked above the sequence. Legend: ≥2 nt insertion, 1 nt insertion, ≥2 nt deletion, 1 nt deletion, −35, −10, SD, Start, Stop.] 


Extended Data Figure 4 | Mutation spectra of thyP3 under uninduced 
transcription. a, b, Illustrations of the mutation spectra of the thyP3 
mutants obtained from fluctuation tests of co-directional (n = 163) (a) 
and head-on (n = 178) (b) strains when transcription is not induced 
(−IPTG). The thyP3 coding sequence with its promoter is shown. 
Sequence coordinates are indicated with reference to +1 transcription 
start site. The symbols used to represent different mutations are shown at 
the bottom, and base substitutions are shown in blue above the sequence. 
The numbers marked in orange next to a mutation denote the frequency. 
The promoter elements, Shine-Dalgarno (SD) sequence, and start and 
stop codons are highlighted in each spectrum. 




[Extended Data Figure 5 image. a–c, Relative fitness versus generations for C₋₇ vs missense mutant (a), C₋₇ vs nonsense mutant (b) and C₋₇ vs frameshift mutant (c). d, Plating efficiency (CFU/mL/OD, ×10⁸) of the different thyP3 mutants.] 

Extended Data Figure 5 | Absence of selection bias in thyP3 forward 
mutation assay. a–c, Growth competition experiments were performed 
between the C₋₇ promoter mutant and the following mutants: missense 
mutant (a), nonsense mutant (b) and frameshift mutant (c). Each mutant 
was competed against the C₋₇ promoter mutant to check if there is a 
competitive disadvantage for a mutant that has a mutation within the 
coding sequence, which may explain the high frequency of C₋₇ mutation 
compared to other mutations. The results show no fitness disadvantage 
for any of the mutants tested, suggesting that the high frequency of 
C₋₇ promoter mutation is not due to a selection bias. Mean ± s.d. of six 
replicates is shown and mutants competed are indicated within the plot. 
d, Plating efficiency of different thyP3 mutants. Plating efficiency was 
determined to check whether different classes of thyP3 mutants have 
differences in their plating efficiency on trimethoprim-selection plates at 
45 °C, which may explain the variation in the mutation rates and spectrum. 
The result shows similar plating efficiency among the different mutants, 
suggesting that plating efficiency does not underlie the variation in the 
observed mutation rates. The different mutants tested are indicated on 
the x axis. Mean ± s.d. of three replicates is shown. The mutant strains are 
listed in the Extended Data Table 1. 


[Extended Data Figure 6 image. a, Wild-type and mutant sequences for a deletion (around −18 to +1) and a duplication (around +66 to +77). c, Co-directional and head-on collision schematics. d, Mutation rate of insertions and deletions for co-directional and head-on strains under −IPTG and +IPTG. e, Models: fork stalling by collision may cause strand slippage; collision may cause fork reversal that is resolved through recombination-mediated replication restart; polymerase may switch template upon collision (template switching); each leading to indel generation. f, Parental and mutant sequences with coordinates +596, +614, +631 and +646.] 

b 
Mutation rate of indels normalized to the length of the region (x10"") 
Position    Co-directional −IPTG    Head-on −IPTG    Co-directional +IPTG    Head-on +IPTG 
5′          0.96                    0.61             1.12                    0.48 
3′          0.52                    0.55             0.38                    1.18 
Promoter    10.2                    5.79             15.8                    1.17 

Extended Data Figure 6 | Mechanism of indel generation. a, 
Representative deletion and duplication events in thyP3. A high-frequency 
deletion and duplication event observed in the thyP3 gene in co-directional 
and head-on strains. The sequence coordinates are denoted and the repeat 
sequence is underlined. b, Table showing the mutation rate of indels (>3 bp) 
in the intragenic region and promoter, normalized by the length of the region, 
suggesting that the localized rate of indels is higher in the promoter than in 
the intragenic region. c, First encounter between replication and transcription 
machineries generates indels. Model describing the first-encounter 
hypothesis proposed on the basis of results presented in Fig. 2a-f. In 
co-directional orientation under induced transcription (+IPTG), when 
an array of RNA polymerases (RNAPs) transcribe the gene, the replisome 
is likely to collide with the first transcription complex at the promoter or 
promoter-proximal regions. By contrast, when transcription is induced 
in head-on orientation, the replisome encounters the first transcription 
complex from the 3′ end. In support of this first-encounter model, when 
transcription is not induced (basal level) the density of RNAP is sparse along 
the gene, hence the sites of collisions are altered. In addition, it is possible 
that under basal transcription, the replisome collides with either the RNAP 
complex arrested at the promoter or with the Lac repressor, which may 
explain the relatively high frequency of deletion at the promoter. Thus the 
first-encounter model of replication-transcription collisions supports the 
idea that collisions stall replisome progression, triggering indel mutations. 
d, Mutation rate of insertions and deletions (>3 bp) within the 5′ or 3′ half 
of the intragenic region. Mean ± s.e.m. of n > 3 experiments is shown. 
e, Models illustrating the different pathways that can lead to generation 
of indels after head-on collision-induced replication stalling: slippage, 
fork-reversal or template switching. f, Illustration of a complex mutation 
observed in thyP3 that is probably generated via microhomology-mediated 
break-induced replication. The complex mutation encompassing a 
deletion and insertion of an inverted region was observed under induced 
transcription in head-on orientation. The sequence coordinates are marked 
on the top with reference to the transcription start site (+1). 




[Extended Data Figure 7 image. a, Total mutation rate (co-directional, head-on). b, Promoter (co-directional, head-on). c, Distribution of indels (co-directional, head-on). d, Base substitutions in coding region (co-directional, head-on). e, T₋₇→C₋₇ mutation (co-directional, head-on).] 

Extended Data Figure 7 | Role of recombination protein RecA in 
collision-induced mutations. a, Mutation rates of co-directional and 
head-on thyP3 strains for trimethoprim resistance in a ΔrecA background. 
Similar to wild type, the mutation rate of the head-on strain is higher than 
that of the co-directional strain, although the total rate of mutation is 
decreased in a ΔrecA background. b, Rate of >3 bp indels at the promoter in 
co-directional orientation is strongly decreased in ΔrecA cells, suggesting 
that indels at the promoter are mostly RecA-dependent. c, Intragenic 
distribution of >3 bp indels in ΔrecA is similar to the distribution 
observed for wild type (Fig. 2f), thus suggesting that RecA is not 
necessary for the collision-induced indels within the coding region. 
d, Mutation rate of base substitutions in ΔrecA cells is higher in head-on 
than co-directional orientation. e, The rate of T₋₇→C₋₇ mutation is 
higher in head-on relative to co-directional orientation in ΔrecA cells, 
thus promoter substitutions can occur at a higher rate independent of 
recombination-mediated repair. All the fluctuation tests in a ΔrecA 
background were performed under inducing conditions (+IPTG). 
Mean ± s.e.m. of n > 3 experiments is shown. **P < 0.01; ***P < 0.001; 
Student's t-test. 




[Extended Data Figure 8, panels a-h. Plot labels: Identical targets;
Co-directional and Head-on, each at -IPTG and +IPTG; Repressor bound and
Repressor unbound (NS); wild-type and mutSL::kan; wild-type and ΔadeC.
b, Rate of base substitutions normalized to the length of the region:
                        Intragenic | Promoter
Co-directional -IPTG    0.33       | 1.10
Head-on -IPTG           0.36       | 4.37
Co-directional +IPTG    540        | 4.71
Head-on +IPTG           0.22       | 5.54
e, Wild-type ATGTGT/TACACA; mutant ACGTGT/TGCACA (AflIII restriction site);
gel lanes: WT, T→C mutant.]

Extended Data Figure 8 | Base substitutions and the role of mismatch
repair and enzymatic adenine deamination. a, IPTG induction does
not affect the base substitution rate in the coding region of thyP3 when
considering identical target sites, indicating that collisions may not be a
major source of these mutations. In yeast, it was shown that transcription-
associated mutagenesis is proportional to the level of transcription’.
In B. subtilis, the total rate of base substitutions in the coding region
decreases upon IPTG induction, which could be due to an unidentified
transcription-dependent mutation-correction mechanism, or due to an
increase of target size of base substitutions in the coding sequence in
uninduced (basal) transcription. b, Table showing the rates of base
substitutions in the coding region and promoter of thyP3 normalized
by length of the region. Localized substitution rates are higher in the
promoter than in the coding sequence, thus suggesting that collision has a
more drastic effect on promoter substitutions. c, Comparative genomic
analysis of mutation rates of promoters with and without repressor binding.
Nucleotide diversity per site (theta) was calculated for each promoter
across different strains of B. subtilis. The comparison shows no significant
difference in nucleotide diversity between repressor-bound promoters and
the rest of the promoters, indicating that repressor binding may not affect
the substitution rate of a promoter. Whole genomes and the repressors
analysed are listed in Extended Data Table 2. NS, not significant, P > 0.05;
Mann-Whitney U-test. d, The mutation frequency of T₋₇→C₋₇ mutation
is higher in head-on than co-directional orientation in E. coli. The
mutation frequency was calculated here from the plasmid-based forward
mutation assay data reported previously”°. e, The restriction-digestion-
based assay to screen for T₋₇→C₋₇ mutation. The wild-type promoter
sequence does not have an AflIII restriction site, whereas the promoter
carrying the T₋₇→C₋₇ mutation is digested by AflIII, as illustrated by a
representative agarose gel. f, Mismatch repair mutant (mutSL::kan) shows
an expected increase (~60-fold) in total mutation rate of thyP3 in both
co-directional and head-on orientation compared to wild-type. The mutation
rates of the wild-type strains are presented in Fig. 1d. g, Mismatch repair
mutant shows a marked ~1,000-fold increase in mutation rate of T→C
substitution hotspots within the coding sequence of head-on thyP3,
indicating that mismatch repair corrects T→C substitution within the
coding sequence. h, Deletion of the adeC gene encoding adenine deaminase
modestly reduces the mutation rate of T₋₇→C₋₇ substitution in both
co-directional and head-on orientation compared with wild type. For f-h,
mean ± s.e.m. of n > 3 experiments is shown. *P < 0.05; ***P < 0.001;
Student's t-test.
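
The screen in panel e rests on a simple sequence property: the T→C change creates a site that AflIII can cut. A minimal sketch of that check follows, assuming the standard AflIII recognition sequence ACRYGT (R = A or G; Y = C or T) and using the hexamers drawn in panel e (wild type ATGTGT, mutant ACGTGT). The function name `has_aflIII_site` is hypothetical and is not part of the study's methods.

```python
import re

# AflIII recognition sequence ACRYGT (R = A/G, Y = C/T), as a regular expression.
# The pattern is its own reverse complement, so scanning one strand is enough.
AFLIII_SITE = re.compile(r"AC[AG][CT]GT")


def has_aflIII_site(sequence: str) -> bool:
    """Return True if the sequence contains at least one AflIII site."""
    return AFLIII_SITE.search(sequence.upper()) is not None


# Hexamers as drawn in panel e (illustrative; flanking promoter sequence omitted).
wild_type = "ATGTGT"
mutant = "ACGTGT"

print("wild type cut by AflIII:", has_aflIII_site(wild_type))
print("mutant cut by AflIII:", has_aflIII_site(mutant))
```

In the gel-based assay this corresponds to an uncut PCR product for the wild-type promoter and cleaved fragments for the mutant promoter.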




Extended Data Table 1 | Strains and plasmids 


a 
Name Genotype Source 
JDW437 (wild-type 168) trpC2 Lab stock 
JDW941 151 phi3T Ronald Yasbin 
JDW942 168 thyA: thyB Ronald Yasbin 
JDW1297 PY79 mutSL::kan Lyle Simmons 
JDW1543 168 AthyA This work 
JDW1544 168 AthyA amyE:: spank -thyP3 (head-on) spc This work 
JDW1563 168 AthyA amyE:: spank th yP3 (co-directional) spc This work 
JDW1711 168 AthyA amyE:: span thyP3 (head-on) spe, lacA: ee en SPOVG-lacZ This work 
JDW1814 168 AthyA amyE:: spanthyP3 (head-on) spc, mutSL: “kan This work 
JDW1900 168 AthyA amyE.:spc-P..,., thyP3 (head-on) /acl This work 
JDW1901 168 AthyA amyE:: ae "-thyP3 (co-directional) lac! This work 
JDW2054 168 AthyA amyE::P thy ( head-on) G,,,.A, 47, SPC This work 
JDW2057 168 AthyA amyE:: van -thyP3 (head-on) T ,>C_, spc This work 
JDW2185 168 AthyA amyE:: on x thyP3 (head-on) 7. (Cy spc This work 
JDW2190 168 AthyA amyE:: spank th yP3 (head-on) GA, spe, lacA: IP en SPOVG-lacZ This work 
JDW2192 168 AthyA amyE:: span thyP3 (head-on) T,—>+C_, spe, lacA:: Pe on SpoVG- “lacZ This work 
JDW2266 168 AyqjH BGSC 
JDW2284 168 AyxiJ BGSC 
JDW2288 168 ArecA BGSC 
JDW2491 168 thyA thyB aMyE--P ny thyP3 A 102-145 deletion spc This work 
JDW2492 168 thyA thyB aMyE--P sony thyP3 TT, j.4 Insertion spc This work 
JDW2501 168 AadeC BGSC 
JDW2529 168 AthyA amyE:: Sank -thyP3 (head-on) spc, AyqjH This work 
JDW2530 168 AthyA amyE:: spank thyP3 (co-directional) spc, AyqjH This work 
JDW2547 168 AthyA amyE:: span thyP3 (head-on) spc, AadeC This work 
JDW2548 168 AthyA amyE:: span thyP3 (co-directional) spc, AadeC This work 
JDW2598 168 AthyA amyE:: spank thyP3 (head-on) spc, ArecA This work 
JDW2612 168 AthyA amyE:: ant thyP3 (co-directional) spc, ArecA This work 
JDW2697 168 AthyA amyE::P___,,-thyP3 (head-on) spc, AyxlJ This work 
JDW2746 168 AthyA amyE:: wren -thyP3 (head-on) G,,,,A, 53. SPC This work 
JDW2747 168 AthyA amyE:: spank th yP3 (head-on) Coes Aces spc, lacA: ‘P yen"SPOVG-lacZ This work 
JDW2748 168 AthyA amyE:: spank -thyP3 (head-on) +1G,,,, spc This work 
JDW2749 168 AthyA amyE:: Soak -thyP3 (head-on) +1G,,.,. Spc, lacA::P,_,-spoVG-lacZ This work 

b 
Name Genotype Source 
pDR90 aMyE::P sony) AMP SPC David Rudner 
pDR110 aMyE::P. ane AMP Spc David Rudner 
pJW299 pEX44/I-Scel site amp cat Lab stock 
pJW331 pPDR9/amyE::P.,,.cmythyP3 (head-on) amp spc This work 
pJW395 pJW299/AthyA |-Scel site amp cat This work 
pJW396 pDR110/amyE::P. ao ,thyP3 (head-on) amp spc This work 
pJW397 pDR110/amyE:: P., er" -thyP3 (co-directional) amp spc This work 
pJW417 pEX44/lacA::P Spo VG-lacZ amp cat This work 
pJW430 pDR11 OlamyE::s spc-P..., nx thyP3 (head-on) lac! amp This work 
pJW431 pDR110/amyE:: “spc-P. * thyP3 (co-directional) lac! amp This work 




a, Bacterial strains used in this study. b, Plasmids used in this study. BGSC, Bacillus Genetic Stock Center (http://www.bgsc.org). 




Extended Data Table 2 | Primers, whole-genome sequences and 
transcriptional regulators 


a 

Name Sequence 5’—3’ 

oJW760 GGTGTCGACATGACTCAATTCGATAAACAA 

oJW761 AATGGCATGCCAATATTTCACCAATTTCAT 

oJW785 GTATGAATTCCAATATTTCACCAATTTCAT 

oJW1011 GCGGATAACAATTTCACACAGGGTCTTCTTGTTTCCACTGAT 
oJW1013 GCGGATAACAATTTCACACAGG CAATATTTCACCAATTTCAT 
oJW1052 GGTAGAATTCACGTTATGGTTAAGATTCAA 

oJW1053 AATGCTCGAGTATCCTTCTTTCATTTTCAG 

oJW1054 GGTACTCGAGTAGCAGGTATCCTAATTTCA 

oJW1055 AATGGGATCCCAGTCCAAATGACAATCTAT 

oJW1137 ATTGGCATGCTCGACTCTCTAGCTTGAG 

oJW1199 TGGTGTCAAAAATAACTCGACCTTCGATATGGGCGGATTCTT 
oJW1200 GAATCCGCCCATATCGAAGGTCGAGTTATTTTTGACACCA 
oJW1201 TGATGTTTGAGTCGGCTGATAGGGAAAAGGTGGTGAACTAC, 
oJW1202 GTAGTTCACCACCTTTTCCCTATCAGCCGACTCAAACATCAAA 
oJW1203 GGCTAAGAGAACAAGGAGGAGACGGTGGAAACGAGGTCATCATTT 
oJW1204 ATGACCTCGTTTCCACCGTCTCCTCCTTGTTCTCTTAGCC 
oJW1213 CATAAAGGCTAGGGATAACAGGGTAATCCGCTCACAATTCCACACAAC 
oJW1214 GCAGACGTTGCCATATCCAATTCAAGCTGGGGATCCTAGAAGCT 
oJW1215 CTTCTAGGATCCCCAGCTTGAATTGGATATGGCAACGTCTGCCC 
oJW1217 CAGAGGTTCCGATTTTAAC 

oJW1218 TCAATTCAGTAACATCGTTC 

oJW1221 GCTTCAGGATGATATTTACAA 

oJW1222 CAGGTGTTCGATATAATCAAG 

oJW1335 GTAAAACGACGGCCAGTGCGTTTCGGTGATGAAGAT 
oJW1336 ATTAAAAACTGGTCTGATCGCTATGCAAGGGTTTATTGTT 
oJW1337 AACAATAAACCCTTGCATAGCGATCAGACCAGTTTTTAAT 
oJW1338 AGGAAATCCATTATGTACTATTTAGTACGCCTCTTTTCTTTTC 
oJW1339 GAAAAGAAAAGAGGCGTACTAAATAGTACATAATGGATTTCCT 
oJW1902 CCTGACTGGGAAGAGGATGACG 

oJW1903 TCAGCTTTCATGGCTATCATTGAAC 

oJW1904 CTGGCTGGAAATACGCTTCTCG 

oJW1905 GATCAACGACGCTCAAGAGCTCA 

oJW1906 GGACTGTCCGCGTCGTTACGT 

oJW1907 GCTTCCTCGCTCCCTTGGG 

oJW2008 GGCATGAGCCTGGGCATGTG 

oJW2009 CTCCGTCTGCGTTTCGCAGTTC 
b 

Bacillus genomes NCBI_accession 

Bacillus subtilis subtilis 168 NC_000964.3 

Bacillus subtilis subtilis BSP1 CP003695 

Bacillus subtilis QB928 CP003783.1 

Bacillus subtilis 6051-HGW CP003329


Bacillus subtilis spizizenii W23 NC_014479.1 
Bacillus subtilis subtilis RO-NN-1 CP002906 
Bacillus subtilis spizizenii TU-B-10 NC_016047 
Bacillus subtilis BSn5 NC_014976.1 
Bacillus amyloliquefaciens FZB42 NC_009725.1 


c 

Regulator name Function 

AbrB transcriptional regulator for transition state genes 

AhrC arginine repressor 

AraR transcriptional repressor of the ara regulon (LacI family)
BkdR transcriptional regulator

CcpA transcriptional regulator (LacI family)

CodY transcriptional repressor CodY 

ComA two-component response regulator 

ComK competence transcription factor (CTF) 

CtsR transcriptional regulator 

DegU two-component response regulator 

Fnr transcriptional regulator (FNR/CAP family) 

Fur transcriptional regulator for iron transport and metabolism 
GlnR transcriptional regulator (nitrogen metabolism)

GltC transcriptional regulator (LysR family)

GltR transcriptional regulator (LysR family)

Hpr transcriptional regulator Hpr 

HrcA heat-inducible transcription repressor 

IolR transcriptional regulator (DeoR family)

LevR transcriptional regulator (NifA/NtrC family) 

LexA transcriptional repressor of the SOS regulon 

MntR manganese transport transcriptional regulator 

Mta transcriptional regulator (MerR family) 

PerR transcriptional regulator (Fur family) 

PucR transcriptional regulator of the purine degradation operon 
PurR pur operon repressor 

ResD two-component response regulator 

RocR transcriptional regulator (NtrC/NifA family) 

SinR transcriptional regulator for post-exponential-phase response
Spo0A master regulator of sporulation

SpoIIID transcriptional regulator of mother cell gene expression
TnrA nitrogen sensing transcriptional regulator

Xre phage PBSX transcriptional regulator

Zur transcriptional regulator (Fur family) 


a, Primers used in this study. b, Whole-genome sequences used in this study. c, Transcriptional
regulators analysed in this study.




LETTER 


doi:10.1038/nature18324 


Allosteric coupling from G protein to the agonist-binding pocket in GPCRs


Brian T. DeVree*, Jacob P. Mahoney*, Gisselle A. Vélez-Ruiz, Soren G. F. Rasmussen, Adam J. Kuszak, Elin Edwald,
Juan-Jose Fung, Aashish Manglik, Matthieu Masureel, Yang Du, Rachel A. Matt, Els Pardon, Jan Steyaert,
Brian K. Kobilka & Roger K. Sunahara


G-protein-coupled receptors (GPCRs) remain the primary conduit
by which cells detect environmental stimuli and communicate
with each other¹. Upon activation by extracellular agonists, these
seven-transmembrane-domain-containing receptors interact
with heterotrimeric G proteins to regulate downstream second
messenger and/or protein kinase cascades¹. Crystallographic
evidence from a prototypic GPCR, the β₂-adrenergic receptor
(β₂AR), in complex with its cognate G protein, Gs, has provided a
model for how agonist binding promotes conformational changes
that propagate through the GPCR and into the nucleotide-binding
pocket of the G protein α-subunit to catalyse GDP release, the key
step required for GTP binding and activation of G proteins². The
structure also offers hints about how G-protein binding may, in
turn, allosterically influence ligand binding. Here we provide
functional evidence that G-protein coupling to the β₂AR stabilizes
a ‘closed’ receptor conformation characterized by restricted access
to and egress from the hormone-binding site. Surprisingly, the
effects of G protein on the hormone-binding site can be observed
in the absence of a bound agonist, where G-protein coupling driven
by basal receptor activity impedes the association of agonists,
partial agonists, antagonists and inverse agonists. The ability of
bound ligands to dissociate from the receptor is also hindered,
providing a structural explanation for the G-protein-mediated
enhancement of agonist affinity, which has been observed for many
GPCR-G-protein pairs. Our data also indicate that, in contrast
to agonist binding alone, coupling of a G protein in the absence
of an agonist stabilizes large structural changes in a GPCR. The
effects of nucleotide-free G protein on ligand-binding kinetics are
shared by other members of the superfamily of GPCRs, suggesting
that a common mechanism may underlie G-protein-mediated
enhancement of agonist affinity.

Sequencing of the human genome revealed the magnitude of the
GPCR superfamily, identifying over 800 genes encoding GPCRs, making
this class of receptors the third-largest gene family³. Despite the
varying nature of the chemical stimuli, which range from photons
to small-molecule odorants and hormones to larger peptides and
proteins, the generation of G-protein-mediated signals proceeds by a
common mechanism. After activation, the receptor engages a
heterotrimeric G protein and catalyses release of GDP from the G protein
α-subunit (Gα). Intracellular GTP then binds the nucleotide-free
G protein, allowing it to regulate downstream effectors (adenylyl cyclase,
phospholipase C, ion channels, and so on) to elicit cellular responses⁴.
We recently used X-ray crystallography², hydrogen-deuterium
exchange mass spectrometry⁵ and electron microscopy⁶ to characterize
an agonist-GPCR-G-protein ternary complex in the absence of
nucleotide. These studies revealed dramatic conformational changes
in the G protein that are stabilized by binding to agonist-activated
receptor and provided insight into the mechanism by which GPCRs
bind G proteins to promote nucleotide exchange. Here, we suggest an
explanation for the allosteric communication that links the nucleotide-
binding site on the G protein to the hormone-binding site on the
receptor, with a focus on conformational changes in the extracellular
face of the receptor that alter access to the hormone-binding site.
GPCR-G-protein interactions have historically been monitored
using radioligand binding assays. Observations as early as the 1970s


[Figure 1, panels a and b: [³H]DHAP binding plots. Panel a label: + Apyrase.
Panel b legend: No GDP; 1 nM GDP; 10 nM GDP; 100 nM GDP; 1 μM GDP;
10 μM GDP; 10 μM GTPγS. Panel b x axis: Time (min), 0-120.]


Figure 1 | Guanine nucleotides influence antagonist binding to
β₂AR•Gs complexes. a, Binding of 2 nM [³H]DHAP to β₂AR•Gs in the
absence or presence of GDP. Addition of apyrase to GDP-bound β₂AR•Gs
led to a progressive decrease in [³H]DHAP binding over time, which could
be restored with excess GDP. b, Addition of increasing concentrations
of GDP enhances the rate and extent of [³H]DHAP binding to apyrase-
treated β₂AR•Gs complexes. a, Data are shown as mean ± standard error
of the mean (s.e.m.) from n = 3 independent experiments performed in
duplicate. b, Data are representative of three independent experiments.


1Department of Pharmacology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. 2Department of Cellular and Molecular Physiology, Stanford University, Palo Alto,
California 94305, USA. 3Structural Biology Research Center, VIB, Vrije Universiteit Brussel (VUB), Pleinlaan 2, 1050 Brussels, Belgium. 4Structural Biology Brussels, Vrije Universiteit Brussel (VUB),
Pleinlaan 2, 1050 Brussels, Belgium. 5Department of Pharmacology, University of California San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093, USA.


*These authors contributed equally to this work. 




[Figure 2, panels a-f. Panel a label: Gαs C terminus. Panel titles:
b, [³H]DHAP on; c, Full agonist; d, Partial agonist; e, Inverse agonist;
f, [³H]DHAP off. Legends: Vehicle; 0.1 μM Nb80; 1 μM Nb80; 10 μM Nb80.
x axes: Time (min).]


Figure 2 | Trapping active-state β₂AR with Nb80 slows both antagonist
and agonist association. a, Nb80 (red) mimics G protein (yellow) in both
its binding site and the β₂AR conformation it stabilizes. The structure of
Nb80-bound β₂AR (Protein Data Bank (PDB) accession 3P0G) is shown in
orange, Gs-bound β₂AR (PDB accession 3SN6) in cyan. b, Pre-incubation
of the β₂AR with increasing concentrations of Nb80 progressively slows
association of neutral antagonist [³H]DHAP to the β₂AR. c-e, Nb80 also
slows association of full agonist [³H]formoterol (c), partial agonist
[³H]CGP12177 (d), and inverse agonist [³H]carvedilol (e) to the β₂AR.
f, Nb80 stabilizes the closed, active conformation and slows [³H]DHAP
dissociation from the β₂AR in a concentration-dependent manner.
b, f, Data are representative of three independent experiments. All other
data are specific binding, shown as mean ± s.e.m. from n = 3 independent
experiments performed in duplicate.


suggested that G-protein coupling enhances agonist affinity for the
receptor, and that this enhancement can be abolished by uncoupling the
G protein from the receptor with guanine nucleotides⁷. These and other
data formed the basis for the ternary complex model of
agonist-receptor-G-protein interactions⁸,⁹. In this paradigm, the active
state of the receptor is stabilized by both the agonist and G protein, and
enhancement of agonist affinity arises owing to the positive cooperativity
between agonist and G protein. However, using purified β₂AR•Gs complexes,
we observed peculiar binding characteristics of the antagonist
[³H]dihydroalprenolol ([³H]DHAP) to β₂AR (Fig. 1a). As illustrated,
addition of GDP increases the observed binding of a saturating
concentration of [³H]DHAP, whereas removal of GDP using a nucleotide
lyase, apyrase, decreases [³H]DHAP binding. The apyrase-mediated decrease
in [³H]DHAP binding is reversed upon addition of excess GDP, suggesting
that the decrease is indeed due to the formation of nucleotide-free
β₂AR•Gs complexes. Removal of GDP from the β₂AR•Gs complex
relies on the constitutive activity of β₂AR and the rapid hydrolysis
(by apyrase) of GDP released from the α-subunit of Gs, Gsα. The
nucleotide-free status of Gsα in these β₂AR•Gs complexes was confirmed
by rapid [³⁵S]GTPγS binding kinetics (Extended Data Fig. 1).
The observed deficit in [³H]DHAP binding to nucleotide-free β₂AR•Gs
is the result of slower [³H]DHAP association (Fig. 1b and Extended
Data Fig. 2). GDP enhances [³H]DHAP association in a concentration-
dependent manner, with similar effects achieved by complete β₂AR•Gs
uncoupling with GTPγS. Although nucleotides do not significantly
affect the affinity (dissociation constant (Kd)) of [³H]DHAP, their
modulatory capacity is γ-phosphate-dependent since GTPγS is
approximately tenfold more potent than GDP (Extended Data Fig. 3).
Thus, β₂AR bound to nucleotide-free G protein adopts a conformation
characterized by restricted access to the hormone-binding site.
Crystallographic and pharmacological evidence suggests that the
active conformation of the β₂AR is stabilized by nucleotide-free Gs
or by a single-chain camelid antibody raised against agonist-bound
β₂AR (nanobody Nb80) (Fig. 2a)²,¹⁰. As illustrated in Fig. 2b (and
Extended Data Fig. 4a), Nb80 stabilizes a conformation of the β₂AR


that restricts [³H]DHAP association, similar to nucleotide-free
Gs. Importantly, Nb80 also slows the association of full agonist,
[³H]formoterol (Fig. 2c), as well as partial agonist, [³H]CGP-12177
(Fig. 2d). These data suggest that in the nucleotide-free Gs- or Nb80-
stabilized active state, the β₂AR adopts a closed conformation impairing
access to the orthosteric ligand-binding site, regardless of the
cooperativity of the orthosteric ligand with the G protein. These data are
in line with our previous observation that the β-adrenergic receptor
inverse agonist ICI-118,551 blocks the formation of β₂AR•Gs complexes,
but is unable to disrupt preformed complexes’. Nb80 also
impairs binding of inverse agonist [³H]carvedilol to the β₂AR by
modestly decreasing the observed association rate (Fig. 2e) but
dramatically decreasing total binding, suggesting that Nb80 and
[³H]carvedilol do not simultaneously occupy β₂AR.

Agonist-promoted G-protein engagement and subsequent nucleotide
loss would be expected to stabilize the active, closed receptor
conformation, thus trapping the agonist in the orthosteric site and
enhancing its observed affinity. Indeed, uncoupling G protein from
receptor using the GTP analogue GppNHp has been shown to accelerate
agonist dissociation from the β₂AR’*. Such agonist-G-protein
cooperativity is not predicted for neutral antagonists like alprenolol,
which do not stimulate G-protein coupling and thus should
not stabilize the closed conformation. However, we have previously
demonstrated that Gs can be ‘forced’ to form a complex with the β₂AR
bound to the antagonist alprenolol!”, provided that free nucleotide
is removed, indicating that antagonist-bound β₂AR retains enough
basal activity to engage Gs. Consistent with this model, Fig. 2f and
Extended Data Fig. 4b clearly illustrate a progressive slowing of
[³H]DHAP (or [³H]CGP-12177, data not shown) dissociation in response
to increasing Nb80 concentrations, suggesting that Nb80-mediated
stabilization of the closed, active receptor conformation can trap
[³H]DHAP in the orthosteric-binding site.
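
The kinetic reasoning above can be made concrete with a one-site mass-action model. The sketch below is illustrative only: the rate constants are hypothetical values chosen for the example, not measurements from this study, and the function names are assumptions. It shows that slowing association and dissociation by the same factor leaves Kd = k_off/k_on unchanged while producing a large binding deficit at a fixed incubation time, and that slowing dissociation alone lowers Kd (higher observed affinity).

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_d = k_off / k_on (in M)."""
    return k_off / k_on


def fraction_bound(t_min: float, k_on: float, k_off: float, ligand_m: float) -> float:
    """Receptor occupancy at time t for one-site binding with ligand in excess.

    Uses the pseudo-first-order solution B(t) = B_eq * (1 - exp(-k_obs * t)),
    with k_obs = k_on * [L] + k_off and B_eq = [L] / ([L] + K_d).
    """
    k_obs = k_on * ligand_m + k_off
    b_eq = k_on * ligand_m / k_obs
    return b_eq * (1.0 - math.exp(-k_obs * t_min))


# Hypothetical rate constants (k_on in M^-1 min^-1, k_off in min^-1).
open_state = {"k_on": 1e8, "k_off": 1e-1}
closed_state = {"k_on": 1e6, "k_off": 1e-3}   # both steps 100-fold slower
trapped_state = {"k_on": 1e8, "k_off": 1e-3}  # only dissociation slowed

ligand = 2e-9  # 2 nM radioligand, the concentration quoted for Fig. 1a

for name, state in [("open", open_state),
                    ("closed", closed_state),
                    ("trapped", trapped_state)]:
    kd = dissociation_constant(**state)
    bound_30 = fraction_bound(30.0, ligand_m=ligand, **state)
    print(f"{name}: Kd = {kd:.1e} M, occupancy at 30 min = {bound_30:.3f}")
```

Under these assumed numbers the open and closed states share the same Kd, yet the closed state is far from equilibrium after 30 min, which is the pattern expected when a lid restricts both entry to and exit from the binding site.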

Analysis of access to the hormone-binding sites in inactive- and
active-state β₂AR structures provides a structural rationale for the
slowing of agonist and antagonist association (Fig. 3, Extended Data Fig. 6




Figure 3 | Activation of the β₂AR closes the hormone-binding site.
a, Stabilization of the β₂AR active conformation by Gs (or Nb80) brings the
side chains of Phe193^ECL2 and Tyr308^7.35 closer to one another compared
to their positions in structures in the absence of G protein. b, Closer view
of the orthosteric site, highlighting Phe193^ECL2 and Tyr308^7.35. Distances
(in Å) between the hydroxyl on Tyr308^7.35 and 2-carbon on the phenyl
ring of Phe193^ECL2 are indicated. c, d, A surface view comparing the
extracellular face of β₂AR in inactive (c) or active (d) conformations,
showing how G-protein-stabilized structural rearrangements occlude the
hormone-binding site in the active state. e, f, Cutaway view illustrating
closure of the hormone-binding site around the bound agonist in the
active state. The inverse agonist carazolol is shown in orange, the agonist
BI-167107 is shown in yellow.


and Supplementary Video 1). The binding of Gs or Nb80 to the β₂AR
stabilizes a rearrangement of the cytoplasmic end of transmembrane
domain 7 (TM7; Fig. 4a, b) that is accompanied by changes immediately
above the ligand-binding site, as well as a change in the structure
of the extracellular loop (ECL) between TM4 and TM5 (ECL2). In
comparison to the inactive β₂AR, the structure of the β₂AR-Gs or
β₂AR-Nb80 (or related Nb6B9)'* complex identifies two aromatic
residues, Phe193^ECL2 (in ECL2) and Tyr308^7.35, that move approximately
2-2.5 Å closer to each other to form a lid-like structure over the
orthosteric ligand-binding site. Lys305^7.32 also contributes to capping
the orthosteric site by trading its salt bridge! with Asp192^ECL2 for
an interaction with the backbone carbonyl of Phe193^ECL2 (Fig. 4c).
These structural changes are stabilized in the active forms of β₂AR
bound to either the ultra-high-affinity agonist BI-167107 or the
smaller, low-affinity agonist adrenaline'®, and formation of this ‘lid’
would be expected to sterically obstruct both ligand association and
dissociation.

To validate this structural model, we tested whether a residue
smaller than tyrosine could modify the capacity of Nb80 to slow
ligand association. Mutation of Tyr308^7.35 to alanine, previously
shown to lower agonist affinity for the β₂AR"®, significantly diminishes
the capacity of Nb80 to slow the association of [³H]DHAP and
even the agonist [³H]formoterol (Extended Data Fig. 5), as suggested
by recent molecular dynamics simulations!”. Interestingly, and in
contrast to [³H]DHAP association, pre-incubation with 10 μM Nb80


Figure 4 | Allosteric communication between the β₂AR G-protein- and
hormone-binding sites. a, In the β₂AR active state (cyan), the cytoplasmic
end of TM6 moves away from the receptor core by ~14 Å relative to its
position in the inactive-state structure, allowing for an inward movement
of TM7. b, Rotation of TM7 allows Tyr326^7.53 (of the highly conserved
NPxxY motif) to fill the space vacated by the conserved aliphatic residue
Ile278^6.40. c, The rotation of TM7 repositions Tyr308^7.35 and Lys305^7.32.
This conformational change allows Lys305^7.32 to coordinate the backbone
carbonyl of Phe193^ECL2, stabilizing its movement towards Tyr308^7.35 to
form a lid over the hormone-binding site.


also enhances the extent of [³H]formoterol binding in the Y308A
mutant. Eliminating barriers that impair access to the orthosteric site
(for example, Y308A) allows the agonist to at least enter the receptor,
where it can stabilize nanobody binding. The enhancement, therefore,
is a reflection of the capacity of the agonist [³H]formoterol to
cooperatively stabilize Nb80 binding and vice versa, and concomitantly slow
the dissociation of the bound agonist (Extended Data Fig. 5d). The
data also suggest that while Tyr308^7.35 markedly limits access to the
orthosteric site, other residues may work in concert with Tyr308^7.35 in
the active β₂AR conformation to slow agonist dissociation.

It is noteworthy that the movement of Phe193^ECL2 and Tyr308^7.35
is not fully observed in the crystal structure of the β₂AR bound to an
agonist alone!®, nor in the inactive-state structure of the β₂AR bound
to the agonist isoprenaline'® (Extended Data Fig. 6 and Supplementary
Videos 1, 2). Binding of G protein or G-protein mimetic (nanobody)
is sufficient to stabilize the closed, active conformation since their
effects on ligand-binding kinetics (as in Figs 1 and 2) are agonist-
independent. An agonist may enhance G-protein engagement but
poorly stabilizes the closed, active conformation by itself. Additionally,
the data presented here suggest that formation of the closed, active
conformation stabilized by the nucleotide-free G protein can occur
owing to basal receptor activity, in keeping with predictions of more
recent models of GPCR pharmacology such as the extended and
cubic ternary complex models”?! (see Supplementary Discussion).
Moreover, conformational changes stabilized by the nucleotide-free
G protein influence not only agonist binding, but ligand binding in
general, implying that the role of nucleotides needs to be included in
an updated version of the ternary complex model.




[Figure 5 schematic. State labels: Open, inactive (low affinity);
Closed, active (high affinity); Closed, active.]


Figure 5 | Basis for G-protein-dependent high-affinity agonist binding.
Agonist binding promotes the G-protein-receptor (R) interaction and
GDP release from the G-protein heterotrimer (Gα (α), Gβγ (β-γ)). In this
nucleotide-free state, the C-terminal helix of Gα remains embedded
in the receptor core, stabilizing the conformational changes at both the
intracellular and extracellular faces of the receptor. At the extracellular
side, the orthosteric ligand-binding site closes around the bound
agonist, sterically opposing agonist dissociation and thereby enhancing
the observed affinity. Constitutive (basal) receptor activity may also
activate the G protein, releasing GDP and thereby stabilizing the closed
conformation of the receptor in the absence of an agonist. See also
Extended Data Fig. 10.


The capacity of G proteins to stabilize a closed receptor conformation
explains the poorly defined GTPγS-mediated increase in radiolabelled
antagonist binding observed with several GPCRs, including
muscarinic, α-adrenergic, adenosine and opioid receptors”-*> (as in
Extended Data Figs 7 and 8). We analysed the behaviour of the M2
muscarinic acetylcholine receptor (M2R) and the μ-opioid receptor
(MOPr) to determine whether GTPγS-mediated uncoupling relieves a
G-protein-stabilized closed conformation. We focused on these receptors
since structural models are available for both inactive and active
conformations?® 7’, and to determine whether the mechanism we
propose for the β₂AR is shared among other GPCRs. The active-state
structure of the M2R, in particular, revealed similar conformational
changes to the β₂AR in that a lid-like structure is formed above the
orthosteric site’” (see Supplementary Videos 4 and 5). Although the
structural changes are not identical, the effect of G proteins (or
nanobodies) on the association and dissociation of ligands at the
orthosteric sites is shared among the β₂AR, M2R and MOPr (Extended Data
Fig. 9 and Supplementary Video 3), suggesting that the allosteric effects of
G proteins on orthosteric agonists may be manifested by conceptually
common mechanisms. More discussion of the details and implications
can be found in Supplementary Discussion. Additionally, many
recent studies have focused on the allosteric effect of sodium ions on
class A GPCR ligand binding and signalling*®°. Outward movement
of TM6 during receptor activation collapses the sodium-binding
pocket in many class A GPCRs, thus it appears that loss of bound
sodium is necessary for G proteins to stabilize a closed, active receptor
conformation.

The formation of the closed conformation is also of particular
interest for the development of allosteric modulators targeting class
A GPCRs. Most allosteric-modulator-binding sites have focused on
the extracellular vestibule located above the orthosteric ligand-binding
sites. For the M2R, for example, the potent M2R allosteric




positive modulator LY2119620 utilizes residues that form the lid in
the active, closed conformation as described here, as the floor of the
vestibule’’. Stabilization of this closed conformation may therefore be
an important aspect of the differentiation between positive allosteric
modulators, which enhance agonist binding and activation, and negative
allosteric modulators, which decrease agonist binding.

We provide pharmacological and biochemical evidence suggest- 
ing that the closed, active conformation of GPCRs is stabilized by 
the nucleotide-free G protein, allowing G proteins to influence pas- 
sage of ligands to the orthosteric-binding site. The dramatic effect of 
G proteins on either ligand association or dissociation is consistent 
with, and in fact validates, structural models generated from X-ray 
crystallography in which G-protein coupling on the intracellular face 
of the receptor allosterically influences the structure of the extracellu- 
lar face. Agonist or hormone binding enhances G-protein engagement, 
whereby formation of the active receptor conformation is accompa- 
nied by nucleotide loss from the G protein. Therefore, the capacity 
of G proteins to enhance agonist-binding affinity is structurally and 
energetically linked to the agonist’s capacity to promote nucleotide 
loss from Gα.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 19 August 2015; accepted 13 May 2016. 
Published online 29 June 2016. 


1. Pierce, K. L., Premont, R. T. & Lefkowitz, R. J. Seven-transmembrane receptors.
Nature Rev. Mol. Cell Biol. 3, 639-650 (2002).

2. Rasmussen, S. G. et al. Crystal structure of the β₂ adrenergic receptor-Gs
protein complex. Nature 477, 549-555 (2011).

3. Venter, J. C. et al. The sequence of the human genome. Science 291,
1304-1351 (2001).

4. Sprang, S. R. G protein mechanisms: insights from structural analysis.
Annu. Rev. Biochem. 66, 639-678 (1997).

5. Chung, K. Y. et al. Conformational changes in the G protein Gs induced by the
β₂ adrenergic receptor. Nature 477, 611-615 (2011).

6. Westfield, G. H. et al. Structural flexibility of the Gαs α-helical domain in the
β₂-adrenoceptor Gs complex. Proc. Natl Acad. Sci. USA 108, 16086-16091
(2011).

7. Maguire, M. E., Van Arsdale, P. M. & Gilman, A. G. An agonist-specific effect
of guanine nucleotides on binding to the beta adrenergic receptor.
Mol. Pharmacol. 12, 335-339 (1976).

8. Ross, E. M., Maguire, M. E., Sturgill, T. W., Biltonen, R. L. & Gilman, A. G.
Relationship between the β-adrenergic receptor and adenylate cyclase.
J. Biol. Chem. 252, 5761-5775 (1977).

9. DeLean, A., Stadel, J. M. & Lefkowitz, R. J. A ternary complex model explains
the agonist-specific binding properties of the adenylate cyclase-coupled
β-adrenergic receptor. J. Biol. Chem. 255, 7108-7117 (1980).

10. Yao, X. J. et al. The effect of ligand efficacy on the formation and stability of
a GPCR-G protein complex. Proc. Natl Acad. Sci. USA 106, 9501-9506
(2009).

11. Rasmussen, S. G. et al. Structure of a nanobody-stabilized active state of the β₂
adrenoceptor. Nature 469, 175-180 (2011).

12. Irannejad, R. et al. Conformational biosensors reveal GPCR signalling from
endosomes. Nature 495, 534-538 (2013).

13. Lefkowitz, R. J. & Williams, L. T. Catecholamine binding to the β-adrenergic
receptor. Proc. Natl Acad. Sci. USA 74, 515-519 (1977).

14. Ring, A. M. et al. Adrenaline-activated structure of β₂-adrenoceptor
stabilized by an engineered nanobody. Nature 502, 575-579
(2013).

15. Bokoch, M. P. et al. Ligand-specific regulation of the extracellular surface of a
G-protein-coupled receptor. Nature 463, 108-112 (2010).

16. Kikkawa, H., Isogaya, M., Nagao, T. & Kurose, H. The role of the seventh
transmembrane region in high affinity binding of a β₂-selective agonist
TA-2005. Mol. Pharmacol. 53, 128-134 (1998).

17. Dror, R. O. et al. Pathway and mechanism of drug binding to G-protein-coupled
receptors. Proc. Natl Acad. Sci. USA 108, 13118-13123 (2011).

18. Rosenbaum, D. M. et al. Structure and function of an irreversible agonist-β₂
adrenoceptor complex. Nature 469, 236-240 (2011).

19. Warne, T. et al. The structural basis for agonist and partial agonist action on a
β₁-adrenergic receptor. Nature 469, 241-244 (2011).

20. Samama, P., Cotecchia, S., Costa, T. & Lefkowitz, R. J. A mutation-induced
activated state of the β₂-adrenergic receptor. Extending the ternary complex
model. J. Biol. Chem. 268, 4625-4636 (1993).

21. Weiss, J. M., Morgan, P. H., Lutz, M. W. & Kenakin, T. P. The cubic ternary
complex receptor-occupancy model. I. Model description. J. Theor. Biol. 178,
151-167 (1996).

22. Burgisser, E., De Lean, A. & Lefkowitz, R. J. Reciprocal modulation of agonist
and antagonist binding to muscarinic cholinergic receptor by guanine
nucleotide. Proc. Natl Acad. Sci. USA 79, 1732-1736 (1982).

23. Bylund, D. B., Gerety, M. E., Happe, H. K. & Murrin, L. C. A robust GTP-induced
shift in α₂-adrenoceptor agonist affinity in tissue sections from rat brain.
J. Neurosci. Methods 105, 159-166 (2001).

24. Prater, M. R., Taylor, H., Munshi, R. & Linden, J. Indirect effect of guanine
nucleotides on antagonist binding to A1 adenosine receptors: occupation of
cryptic binding sites by endogenous vesicular adenosine. Mol. Pharmacol. 42,
765-772 (1992).

25. Werling, L. L., Puttfarcken, P. S. & Cox, B. M. Multiple agonist-affinity states of
opioid receptors: regulation of binding by guanyl nucleotides in guinea pig
cortical, NG108-15, and 7315c cell membranes. Mol. Pharmacol. 33, 423-431
(1988).

26. Haga, K. et al. Structure of the human M2 muscarinic acetylcholine receptor
bound to an antagonist. Nature 482, 547-551 (2012).

27. Kruse, A. C. et al. Activation and allosteric modulation of a muscarinic
acetylcholine receptor. Nature 504, 101-106 (2013).

28. Manglik, A. et al. Crystal structure of the μ-opioid receptor bound to a
morphinan antagonist. Nature 485, 321-326 (2012).

29. Huang, W. et al. Structural insights into μ-opioid receptor activation. Nature
524, 315-321 (2015).

30. Katritch, V. et al. Allosteric sodium in class A GPCR signaling. Trends Biochem.
Sci. 39, 233-244 (2014).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank T. S. Kobilka for preparation of affinity
chromatography reagents and F. S. Thian for help with cell culture. We
thank J. Traynor and J. Tesmer for their support and use of their laboratory
space for J.P.M. This work was supported by the Lundbeck Foundation
(Junior Group Leader Fellowship to S.G.F.R.); Fund for Scientific Research
of Flanders (FWO-Vlaanderen) and the Institute for the encouragement of
Scientific Research and Innovation of Brussels (ISRIB) (E.P. and J.S.); National
Institute of Neural Disorders and Stroke grant R01-NS28471 (B.K.K.); the
Mather Charitable Foundation (B.K.K.); National Institute of General Medical
Sciences grants R01-GM083118 and U19-GM106990 (B.K.K. and R.K.S.)
and R01-GM068603 (R.K.S.); National Institutes of Drug Abuse R21-031418
(B.K.K. and R.K.S.); Michigan Diabetes Research and Training Center Grant,
National Institute of Diabetes and Digestive and Kidney Diseases, P60DK-20572
(R.K.S.); University of Michigan Biological Sciences Scholars Program
(R.K.S.) and the Rackham School of Graduate Studies (B.T.D.); Molecular
Biophysics Training Grant T32GM008270 (B.T.D.); Cell and Molecular
Biology Training Grant T32GM007315 (G.A.V.-R.) and Pharmacological
Sciences Training Program T32GM007767 (J.P.M.); and AHA Midwest Affiliate
Predoctoral Fellowship 13PRE17110027 (J.P.M.).


Author Contributions B.T.D., J.P.M., G.A.V.-R., B.K.K. and R.K.S. designed the 
experiments. B.T.D., J.P.M., G.A.V.-R. and A.J.K. performed research; S.G.F.R., 
E.E., J.-J.F., A.M., M.M., Y.D., R.A.M., E.P. and J.S. contributed valuable reagents/ 
analytic tools; B.T.D., J.P.M., G.A.V.-R., B.K.K. and R.K.S. analysed data; and 
B.T.D., J.P.M., B.K.K. and R.K.S. wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 

R.K.S. (rsunahara@ucsd.edu). 




© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 


Large-scale purification of the β₂AR. β₂AR bearing an N-terminal Flag tag and
C-terminal 10×-His tag was expressed in Sf9 cells (Invitrogen) and purified as
previously described’. 

Expression and purification of G protein and nanobodies. Gs and Go heterotrimers
were expressed in HighFive (Invitrogen) insect cells using recombinant
baculovirus and purified by chromatography on Ni-NTA, MonoQ, and Superdex
200 resin, as previously described³¹. Nanobodies were expressed in Escherichia coli
and purified as previously described!!4”, 

Membrane preparations. HEK293T cells (ATCC) were used for small-scale
expression and purification of β₂AR and mutants. Cells were grown in DMEM plus
10% FBS to ~70% confluency, then transfected with monomeric yellow fluorescent
protein (mYFP)-β₂AR (pCMV5, 6 μg DNA per 10-cm plate) using Lipofectamine
2000. Cells were harvested 40-48 h post-transfection in ice-cold lysis buffer
(50 mM HEPES, pH 8.0, 65 mM NaCl, 1 mM EDTA, 35 μg ml⁻¹ phenylmethylsulfonyl
fluoride, 32 μg ml⁻¹ each tosyl-L-phenylalanine-chloromethylketone and
tosyl-L-lysine-chloromethylketone, 3.2 μg ml⁻¹ leupeptin, 3.2 μg ml⁻¹ ovomucoid
trypsin inhibitor). The cell suspension was sonicated using a Branson Sonifier
and centrifuged for 20 min at 25,000g. The pellet was resuspended in wash buffer
(50 mM HEPES, pH 8.0, 100 mM NaCl with protease inhibitors listed earlier)
using a Dounce homogenizer, then centrifuged for 20 min at 25,000g. The pellet
was resuspended and homogenized in minimal wash buffer and the volume
was adjusted to reach a final protein concentration of 5 mg ml⁻¹ as measured by
the Bradford protein assay. Membranes were frozen by slowly pouring into liquid
nitrogen and stored at −80 °C until use.

Enrichment of β₂AR and β₂AR(Y308A) from HEK293T cells. Frozen membranes
were thawed on ice and NaCl, MgCl₂ and GTPγS were added to reach final
concentrations of 300 mM, 1 mM and 10 μM, respectively. Timolol was then added
to a final concentration of 1 μM and the membranes were incubated for 10 min on
ice. Receptors were solubilized for 1 h at 4 °C in the presence of 1% dodecylmaltoside
(DDM) and 0.1% cholesterol hemisuccinate (CHS). After centrifugation for
30 min at 25,000g, the supernatant was applied to Ni-NTA agarose. The column
was slowly washed with 20 column volumes of 20 mM HEPES, pH 8.0, 300 mM
NaCl, 0.1% DDM, 0.01% CHS to remove bound timolol. Receptor was eluted in
the same buffer plus 200 mM imidazole and concentrated using an Amicon 30 kDa
cut-off spin concentrator for addition to the reconstituted high-density lipoprotein
particles (rHDL) reconstitution mixture.

Receptor reconstitution into rHDL particles. Reconstitutions were performed
as described³², with the amount of receptor added never exceeding 20% of the
total reaction volume. For samples that contained Gs, the purified heterotrimer
was added to the preformed β₂AR-rHDL particles, incubated for 2 h at 4 °C, and
BioBeads (Bio-Rad) were used to remove the added detergent. Nucleotide-free
Gs•β₂AR complex was prepared by incubating β₂AR-Gs-rHDL particles with
apyrase in the presence of 1 mM MgCl₂ for 30 min at room temperature, or
alternatively, 2 h at 4 °C. If needed, the sample was passed through a Superdex 200 gel
filtration column to remove free nucleotide and apyrase.

Radioligand association experiments using rHDL particles. All assays were 
performed in Tris-buffered saline (TBS; 25 mM Tris-HCl, pH 7.4, 136 mM NaCl,
2.7 mM KCl) with a final concentration of 0.05% w/v bovine serum albumin
(BSA). Reaction components were mixed and pre-incubated at room temperature 
(see later) before the addition of radioligand to initiate the association time course. 


Aliquots were withdrawn at the indicated times and filtered over Whatman GF/B 
filters pre-soaked in 0.3% w/v polyethyleneimine. Filters were washed with ice- 
cold TBS, dried, and subjected to liquid scintillation counting on a TopCount 
NXT (Perkin-Elmer). Bound ligand never exceeded 10% of the total ligand added. 
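
The 10% ceiling on bound ligand can be checked with simple arithmetic. The following is a minimal sketch, not the authors' procedure: the function names and the 100 μl reaction volume are hypothetical, chosen only to illustrate the unit conversion (1 nM in 1 μl is 1 fmol).

```python
def total_ligand_fmol(conc_nM: float, volume_uL: float) -> float:
    """Total radioligand in the reaction, in fmol.

    1 nM x 1 uL = 1e-9 mol/L x 1e-6 L = 1e-15 mol = 1 fmol.
    """
    return conc_nM * volume_uL


def depletion_ok(bound_fmol: float, conc_nM: float, volume_uL: float,
                 limit: float = 0.10) -> bool:
    """True if bound ligand stays within `limit` (10%) of total ligand added."""
    return bound_fmol <= limit * total_ligand_fmol(conc_nM, volume_uL)


# Hypothetical example: 5 nM radioligand in an assumed 100 uL reaction.
total = total_ligand_fmol(5.0, 100.0)      # 500 fmol of ligand in total
print(total, depletion_ok(16.6, 5.0, 100.0))
```

Keeping bound ligand to a small fraction of the total means the free ligand concentration stays close to the concentration added throughout the time course.
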
Kinetic binding experiments with [³H]DHAP and Nb80, β₂AR-rHDL. For association
experiments, receptor in rHDL was pre-incubated with varying concentrations
of Nb80 and the reaction was started by addition of 5 nM [³H]DHAP (Perkin
Elmer). For dissociation experiments, the samples were first incubated with 5 nM
[³H]DHAP for 30 min, followed by incubation with varying Nb80 concentrations
for 30 min. The reaction was started by adding 50 μM cold alprenolol. Non-specific
binding was determined in the presence of 10 μM (+/−)-propranolol.

Binding experiments with [³H]DHAP and Gs•β₂AR nucleotide-free complexes.
For association experiments, gel-filtered samples of apyrase-treated Gs•β₂AR-rHDL
particles were incubated with 5 nM [³H]DHAP to bind any receptor that was
not complexed with Gs. The experiment was started by adding varying amounts
of either GDP or GTPγS. For 'equilibrium' binding experiments, samples were
incubated with all the indicated components at room temperature for 90 min
before filtration. Non-specific binding was determined in the presence of 10 μM
(+/−)-propranolol.

[³H]formoterol association to β₂AR. β₂AR-rHDL was incubated with the indicated
concentrations of Nb80 for 30 min at room temperature. [³H]formoterol
(Perkin Elmer) was added to reach 10 nM final concentration. These assays also
contained 1 mM ascorbic acid in the reaction buffer. Non-specific binding was
determined in the presence of 10 μM (+/−)-propranolol.

[³H](−)-CGP-12177 association to β₂AR. β₂AR-rHDL was incubated with the
indicated concentrations of Nb80 for 30 min at room temperature. [³H](−)-CGP-12177
(Perkin Elmer) was added to reach 1 nM final concentration. Non-specific
binding was determined in the presence of 10 μM (+/−)-propranolol.

[³H]carvedilol association to β₂AR. Owing to high amounts of non-specific
[³H]carvedilol (American Radiolabelled Chemicals) binding both to BSA and to
the glass fibre filters typically used for separation, β₂AR-rHDL was diluted into
empty rHDL particles rather than into a 5× BSA solution (0.25% w/v BSA in TBS
buffer) before addition to the assay mix. Using empty rHDL in place of BSA was
critical for maintaining sample recovery from the assay plate while improving
the signal-to-noise ratio of the assay. The receptor was incubated with the indicated
concentrations of Nb80 for 15 min at room temperature, then for 30 min at
4 °C. [³H]carvedilol was added to reach a 1 nM final concentration. Aliquots were
withdrawn at the indicated time points and bound ligand was isolated using gel
filtration on Sephadex G75 resin. Non-specific binding was determined in the
presence of 10 μM (+/−)-propranolol.


31. Kozasa, T. & Gilman, A. G. Purification of recombinant G proteins from Sf9 cells
by hexahistidine tagging of associated subunits. Characterization of α₁₂
and inhibition of adenylyl cyclase by αz. J. Biol. Chem. 270, 1734-1741
(1995).

32. Whorton, M. R. et al. A monomeric G protein-coupled receptor isolated in a
high-density lipoprotein particle efficiently activates its G protein. Proc. Natl
Acad. Sci. USA 104, 7682-7687 (2007).

33. White, J. F. et al. Structure of the agonist-bound neurotensin receptor. Nature
490, 508-513 (2012).

34. Krumm, B. E., White, J. F., Shah, P. & Grisshammer, R. Structural prerequisites
for G-protein activation by the neurotensin receptor. Nature Commun. 6,
7895-7895 (2015).




[Figure: [³⁵S]GTPγS bound (fmol) plotted against time (sec).]

Extended Data Figure 1 | Confirmation of nucleotide removal from
β₂AR•Gs by apyrase. Gs and Flag-tagged β₂AR were reconstituted in
rHDL and treated with the non-specific nucleotide lyase, apyrase. Samples
were applied to an anti-Flag affinity resin to remove products of the GDP
degradation (GMP and Pi). Samples were incubated with 100 nM [³⁵S]GTPγS
at room temperature. At various times, samples were subjected to
rapid filtration through glass fibre filters (GF/B) followed by 10 volumes
of ice-cold buffer washes containing 10 μM GDP. Filters were dried and
subjected to liquid scintillation counting (Top-Count, Perkin-Elmer).
To a first approximation, the rapid binding event suggests that the complex
is empty of nucleotide, based on the limited temporal resolution of this
mixing and filtration technique. [³H]DHAP and [³⁵S]GTPγS binding to
the reconstituted complex yields a final R:G ratio of 1:0.95, suggesting
that up to 95% of the β₂AR-rHDL particles contain a single functional
G protein. This suggests that only those G proteins associated with
the β₂AR will bind [³⁵S]GTPγS within this time frame in the absence
of receptor agonists. Data are shown as mean ± s.e.m. from n = 3
independent experiments performed in duplicate.




[Figure: a, time course (Time, min) with no GDP and with 1 nM, 3 nM, 10 nM, 30 nM, 100 nM, 300 nM, 1 μM, 3 μM and 10 μM GDP; b, plot against Log [GDP], M.]

Extended Data Figure 2 | GDP accelerates [³H]DHAP binding
to β₂AR•Gs. a, Time course monitoring [³H]DHAP association to
apyrase-treated β₂AR•Gs complexes in the presence of varying GDP
concentrations. GDP increases both the observed association rate
constant and the maximum binding of [³H]DHAP. b, Concentration-response
curve showing enhancement of the observed [³H]DHAP
association rate constant by GDP (half-maximum effective concentration
(EC₅₀) = 181 ± 66 nM). All data are shown as mean ± s.e.m. from n = 3
independent experiments performed in duplicate.



[Figure: a, specific [³H]DHAP bound (fmol) against [³H]DHAP (nM), control and +10 μM GTPγS; b, [³H]DHAP bound (% max) against Log [Nucleotide], M.]

Extended Data Figure 3 | Effect of guanine nucleotides on
[³H]DHAP binding to β₂AR•Gs. a, In saturation binding assays,
addition of GTPγS to apyrase-treated β₂AR•Gs complexes increased the
observed maximal binding, Bmax, for [³H]DHAP without significantly
altering Kd (control: Bmax = 5.5 ± 0.52 fmol, Kd = 0.88 nM; +GTPγS:
Bmax = 16.6 ± 1.9 fmol, Kd = 0.56 nM). b, Both GDP and GTPγS could
enhance maximal [³H]DHAP binding in a concentration-dependent
manner (GDP log(EC₅₀) = −6.42 ± 0.12, or EC₅₀ ≈ 386 nM;
GTPγS log(EC₅₀) = −7.45 ± 0.16, or EC₅₀ ≈ 35 nM). All data are
shown as mean ± s.e.m. from n = 3 independent experiments performed
in duplicate.
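
The caption quotes potencies both as log(EC₅₀) and in nM, and reports Bmax and Kd from saturation binding. The sketch below shows how those quantities relate; it assumes the standard one-site hyperbolic binding model, which the caption does not name, and the function names are illustrative only.

```python
def ec50_nM(log_ec50_molar: float) -> float:
    """Convert log10(EC50 in M) to EC50 in nM."""
    return 10 ** log_ec50_molar * 1e9


def one_site_binding(ligand_nM: float, bmax_fmol: float, kd_nM: float) -> float:
    """Specific binding under an assumed one-site model: B = Bmax*[L]/(Kd+[L])."""
    return bmax_fmol * ligand_nM / (kd_nM + ligand_nM)


# log(EC50) values quoted in the caption (rounded, so the nM values are approximate).
print(round(ec50_nM(-6.42)), round(ec50_nM(-7.45), 1))

# Predicted specific binding at 5 nM radioligand with the quoted Bmax and Kd.
print(round(one_site_binding(5.0, 5.5, 0.88), 2))    # control
print(round(one_site_binding(5.0, 16.6, 0.56), 2))   # +GTPγS
```

Because the quoted logarithms are rounded to two decimals, the back-calculated EC₅₀ for GDP comes out near 380 nM rather than exactly the 386 nM given in the caption.
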




[Figure: panels plot t½ of [³H]DHAP (min) against Log [Nb80] (M), initial rate against Log [Nb80], M, and [³H]DHAP bound (% max) against log [Nb], M, for Nb80 and Nb30.]

Extended Data Figure 4 | Effect of Nb80 on antagonist binding to the
β₂AR. a, Association of [³H]DHAP is progressively slowed after
pre-incubation of the β₂AR with increasing concentrations of Nb80.
b, If [³H]DHAP is allowed to first equilibrate with the β₂AR, Nb80 slows
[³H]DHAP dissociation from β₂AR in a concentration-dependent manner.
c, Owing to the dramatic slowing of [³H]DHAP binding kinetics, Nb80
(but not a control nanobody, Nb30, which has no effect on agonist affinity
for β₂AR) seems competitive with [³H]DHAP if insufficient time is given
to reach equilibrium. Data shown are from assays incubated for 90 min
at room temperature. All data are shown as mean ± s.e.m. from n = 3
independent experiments performed in duplicate.




[Figure: a, b, specific [³H]DHAP bound (fmol) against time (min), control and +10 μM Nb80; c, d, specific [³H]formoterol bound (fmol) against time (min), 0.1 μM and 10 μM Nb80; panels labelled WT β₂AR and Y308A.]

Extended Data Figure 5 | Y308A mutation abolishes the rate-slowing
effects of Nb80. a, b, Time course of [³H]DHAP binding to wild-type
(WT) β₂AR (a) or β₂AR(Y308A) (b) after pre-incubation of receptor
with Nb80. Nb80 significantly slowed [³H]DHAP association to wild-type
β₂AR (−Nb80 observed rate constant, kobs = 0.45 ± 0.05 min⁻¹ or
association half-time, t½ = 1.5 ± 0.2 min, +Nb80 kobs = 0.20 ± 0.03 min⁻¹
or t½ = 3.5 ± 0.5 min; P = 0.011 by an unpaired two-tailed t-test),
but less effectively slowed [³H]DHAP association to β₂AR(Y308A)
(−Nb80 kobs = 0.50 ± 0.06 min⁻¹ or t½ = 1.4 ± 0.2 min; +Nb80
kobs = 0.32 ± 0.01 min⁻¹ or t½ = 2.2 ± 0.1 min; P = 0.05 by an unpaired
two-tailed t-test). All data are shown as mean ± s.e.m. from n = 4 (−Nb80) or
n = 3 (+Nb80) independent experiments performed in duplicate. c, d, Time
course of [³H]formoterol binding to wild-type β₂AR (c) or β₂AR(Y308A) (d)
after pre-incubation of receptor with Nb80. Nb80 slowed [³H]formoterol
association to wild-type β₂AR (0.1 μM Nb80 kobs = 0.68 ± 0.13 min⁻¹
or t½ = 1.0 ± 0.2 min, 10 μM Nb80 kobs = 0.27 ± 0.05 min⁻¹ or
t½ = 2.6 ± 0.5 min; P = 0.031 by an unpaired two-tailed t-test). However,
with β₂AR(Y308A), Nb80 had little effect on the observed association
rate constant but enhanced the amount of [³H]formoterol bound (0.1 μM
Nb80 kobs = 0.37 ± 0.11 min⁻¹ or t½ = 1.9 ± 0.6 min with a plateau of
10.1 ± 0.8 fmol, 10 μM Nb80 kobs = 0.53 ± 0.13 min⁻¹ or t½ = 1.3 ± 0.4 min
with a plateau of 21.3 ± 1.2 fmol; unpaired two-tailed t-test of the kobs
values showed P = 0.4). All data are shown as mean ± s.e.m. from n = 4
independent experiments performed in duplicate.
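
The paired kobs and t½ values in this caption are consistent with a single-exponential (one-phase) association, for which t½ = ln 2 / kobs. The following is a minimal sketch under that assumption; the model is inferred from the reported numbers rather than stated in the caption, and the function names are illustrative.

```python
import math


def half_time(k_obs_per_min: float) -> float:
    """Association half-time (min) for a one-phase exponential: ln(2) / k_obs."""
    return math.log(2) / k_obs_per_min


def bound_at(t_min: float, plateau_fmol: float, k_obs_per_min: float) -> float:
    """Specific binding at time t for B(t) = plateau * (1 - exp(-k_obs * t))."""
    return plateau_fmol * (1.0 - math.exp(-k_obs_per_min * t_min))


# Mean k_obs values quoted in the caption (min^-1); compare the printed
# half-times with the t1/2 values reported alongside them.
for label, k in [("WT, -Nb80", 0.45), ("WT, +Nb80", 0.20),
                 ("Y308A, -Nb80", 0.50), ("Y308A, +Nb80", 0.32)]:
    print(f"{label}: t1/2 = {half_time(k):.1f} min")
```

By construction, `bound_at` returns half the plateau when evaluated at `half_time(k)`, which is a quick way to sanity-check fitted parameters against a plotted time course.
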




[Figure: structures of β₂AR, MOPr and M2R shown in columns labelled Inverse Agonist and Agonist.]

Extended Data Figure 6 | The closed conformation stabilized by agonist
and G protein (or mimic). Illustrated are the crystal structures of agonist-
versus inverse-agonist-bound β₂AR (cyan) and β₁AR (yellow), where only
β₂AR is bound to G protein. Similarly, the MOPr (orange) adopts a closed
conformation upon binding the G-protein surrogate, Nb39. β₂AR, PDB
accession 2RH1; β₂AR•Gs, PDB accession 3SN6; β₁AR, PDB accession
2YCW; β₁AR-iso, PDB accession 2Y03; MOPr, PDB accession 4DKL;
MOPr-Nb39, PDB accession 5C1M; M2R, PDB accession 3UON;
M2R-Nb9-8, PDB accession 4MQS.




[Figure: a, b, binding against Log [Isoproterenol], M; c, d, [³H]DPN bound against Log [morphine], M; control and +GTPγS in each panel.]

Extended Data Figure 7 | Effects of guanine nucleotides on [³H]antagonist
binding are also seen in competition binding assays.
a, Agonist (isoproterenol) competition binding using apyrase-treated
β₂AR•Gs complexes shows the characteristic G-protein-dependent shift
in agonist affinity, along with a dramatic increase in total [³H]DHAP
binding, upon the addition of 10 μM GTPγS. b, Normalization of the data
from a yields a plot representative of what is commonly reported in the
literature. c, Similar to the β₂AR, agonist (morphine) competition binding
using MOPr•Go complexes shows the characteristic G-protein-dependent
shift in agonist affinity, along with a dramatic increase in total [³H]DPN
binding, upon the addition of 10 μM GTPγS. d, Normalization of the
data from c. All data are shown as mean ± s.e.m. from n = 3 independent
experiments performed in duplicate.




[Figure: a, b, c, time courses (Time, min): a, vehicle and +GTPγS; b, 10 μM and 100 μM Nb9-8; c, apyrase and GTPγS.]

Extended Data Figure 8 | The MOPr and M2R behave
similarly to the β₂AR when bound to nucleotide-free
G protein or an active-state-stabilizing nanobody. a, After
apyrase treatment of M2R•Go complexes, addition of 10 μM
GTPγS enhances association of [³H]N-methylscopolamine
([³H]NMS) to M2R (vehicle kobs = 0.32 ± 0.02 min⁻¹ or
t½ = 2.2 ± 0.1 min, +GTPγS kobs = 0.54 ± 0.02 min⁻¹ or
t½ = 1.3 ± 0.1 min; P = 0.002 by an unpaired two-tailed t-test).
Data are shown as mean ± s.e.m. from n = 3 independent
experiments performed in duplicate. Addition of GDP was
also able to increase the rate of [³H]NMS binding (inset;
log(EC₅₀) = −6.91 ± 0.18 or EC₅₀ ≈ 123 nM; mean ± s.e.m.
from n = 2 independent experiments performed in duplicate).
b, Pre-treatment of M2R with either 10 μM (black circles) or
100 μM (red squares) Nb9-8 (ref. 27) impairs association of
[³H]iperoxo to M2R (10 μM Nb9-8 kobs = 0.68 ± 0.09 min⁻¹ or
t½ = 1.0 ± 0.2 min, 100 μM Nb9-8 kobs = 0.25 ± 0.04 min⁻¹ or
t½ = 2.8 ± 0.5 min; P = 0.04 by an unpaired two-tailed t-test).
Data are shown as mean ± s.e.m. from n = 3 (10 μM
Nb9-8) or n = 2 (100 μM Nb9-8) independent experiments
performed in duplicate. c, Addition of 10 μM GTPγS to
apyrase-treated MOPr•Go complexes hastened association
of the antagonist [³H]diprenorphine ([³H]DPN) to MOPr
(apyrase kobs = 0.06 ± 0.02 min⁻¹ or t½ = 9.8 ± 1.3 min,
+GTPγS kobs = 0.12 ± 0.01 min⁻¹ or t½ = 5.6 ± 0.6 min; P = 0.1
by an unpaired two-tailed t-test). The effect of nucleotide-free
G protein was recapitulated by pre-incubating MOPr with Nb39
(ref. 28) (inset; control kobs = 0.13 ± 0.01 min⁻¹, +100 μM Nb39
kobs = 0.07 ± 0.02 min⁻¹). Data are shown as mean ± s.e.m.
from n = 2 (MOPr•Go) or n = 3 (MOPr + Nb39) independent
experiments performed in duplicate.
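
The P values in this caption come from unpaired two-tailed t-tests. As a rough cross-check, the test can be approximated from the reported means and s.e.m. values alone. The sketch below is an assumption-laden illustration, not the authors' analysis: it uses the pooled-variance form, which for equal group sizes reduces to t = (m₂ − m₁)/√(sem₁² + sem₂²) with 2n − 2 degrees of freedom, and it integrates the t density numerically with only the standard library. Because the inputs are rounded summary values, the result only approximates the published P value.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 10000) -> float:
    """Two-tailed P value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    s = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * s * h / 3.0)


def t_from_summary(m1: float, sem1: float, m2: float, sem2: float, n: int):
    """t statistic and degrees of freedom for two groups of equal size n."""
    return (m2 - m1) / math.sqrt(sem1 ** 2 + sem2 ** 2), 2 * n - 2


# Panel a: vehicle versus +GTPγS k_obs for M2R, n = 3 per group.
t, df = t_from_summary(0.32, 0.02, 0.54, 0.02, 3)
print(round(t, 2), df, round(two_tailed_p(t, df), 4))
```

For panel a this gives a P value between 0.001 and 0.002, in line with the reported P = 0.002. The shortcut does not apply directly to comparisons with unequal group sizes, such as panel b.
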




[Figure: extracellular views labelled Inactive MOPr (PDB: 4DKL), Active MOPr (PDB: 5C1M), Inactive NTS-1R (PDB: 4GRV) and Partially Active NTS-1R (PDB: 4XEE).]

Extended Data Figure 9 | The extracellular regions in the active
conformations of peptide hormone/agonist receptors MOPr and
NTS-R1. Illustrated are the crystal structures of the inactive and active
(or partially active neurotensin receptor 1, NTS-R1) conformations of the
MOPr and NTS-R1 from the top or extracellular view of the receptor.
The surface rendering highlights residues or structure on the extracellular
face that change upon receptor activation (circled). The MOPr in
its inactive conformation (purple) is compared to the Nb39-bound
(G-protein mimic) form in blue. Similarly, the inactive NTS-R1 (ref. 33)
(green) is compared with a mutant NTS-R1 (ref. 34) that adopts a partially
active conformation (orange). MOPr, PDB accession 4DKL; MOPr-Nb39,
PDB accession 5C1M; NTS-R1, PDB accession 4GRV; active-like NTS-R1,
PDB accession 4XEE.




Extended Data Figure 10 | Model of G-protein-dependent high-affinity
agonist binding. a, b, As in Fig. 5, nucleotide-free G-protein-stabilized
family A GPCRs experience alterations in the extracellular face of the
receptor, thus affecting the orthosteric-binding site. In a monoamine
receptor such as the β₂AR, G-protein binding and GDP loss accompanies
the stabilization of a closed, active conformation of the receptor, as in a.
b, For family members such as MOPr or NTS-R1, where the peptide
hormones/agonists are considerably larger, the influence of the G-protein-mediated
changes in the extracellular domain structure result in similar
effects on orthosteric ligand dissociation. Rather than closing over the
orthosteric site as with monoamine receptors as in a, the extracellular face
may contain structures and residues that 'pinch' the larger ligands.




NICO SCHERF, KONSTANTIN THIERBACH, GOPI SHAH, INGO ROEDER, JAN HUISKEN 


JASMIN IMRAN ALSOUS & PAUL VILLOUTREIX 


TOOLBOX 


THE VISUALIZATIONS
TRANSFORMING BIOLOGY


Inventive graphic design and abstract models are helping 
researchers to make sense of a glut of data. 


[Figure: zebrafish embryo, anterior to posterior. Left: positions of cells (over 20 mins) and trajectories of cells, distinguishing cells that go on to form the brain/nervous system from cells that develop into inner organs/connective tissue. Right: 2D (Mercator) projection of cellular highways.]

The ‘flow’ of cells in a developing zebrafish embryo, seen in 3D microscopy data (left) and as a 2D projection (right).


BY EWEN CALLAWAY 


A smart visualization can transform
biologists’ understanding of their data.
And now that it’s possible to sequence
every RNA molecule in a cell or fill a hard drive
in a day with microscopy images, life scientists
are increasingly seeking inventive visual ways
of making sense of the glut of raw data that
they collect.

Some of the visualizations that are currently 
exciting biologists were presented at a confer- 
ence at the European Molecular Biology Lab- 
oratory in Heidelberg, Germany, in March. 
Called Visualizing Biological Data (VIZBI), 
the meeting was co-organized by Sean 
O’Donoghue, a bioinformatician at the 
Garvan Institute of Medical Research in 
Sydney, Australia. The gathering attracts 
an eclectic mix of lab researchers, com- 
puter scientists and designers and is 
now in its seventh year. 

Here, Nature highlights some of 
O’Donoghue’s picks of the visualizations 
that are set to transform biology. 


STREAMLINED CELLS 

Bioinformatics postdoc Nico Scherf watches 
cells shift paths to form different germ lay- 
ers and then organs in developing zebrafish 
embryos using ‘light-sheet microscopy’ tech- 
niques developed by his supervisor at the Max 
Planck Institute of Molecular Cell Biology and 
Genetics in Dresden, Germany. But, he says, 
when tracing the path of every single zebrafish 
cell, “you end up with a hairball” of tracks. To get 
some meaning out of these hairballs, Scherf borrowed
some fluid-dynamics approaches used to
analyse atmospheric and ocean currents. “You
only plot the major streamlines, which gives you
the highways of cellular motion,” he says. To
achieve this, Scherf wrote some software to analyse
the images, and will share it with others on
request. So far, his approach has revealed that
a mutation that causes abnormal organ development
alters the movement of cells only very
early in zebrafish development. And he thinks
that people who are studying the development
of other organisms could benefit from getting
into the flow of things, too.


Cell interconnections in a Drosophila egg chamber
(left) are represented as a 1D network (right).


ABSTRACT CONNECTIONS 
Jasmin Imran Alsous, a developmental biologist 
at Princeton University in New Jersey, took 
inspiration from Picasso when trying to make 
sense of microscopy images of a fruit fly’s 
egg chamber, a torpedo-shaped cluster that 
forms when a germ cell goes through four 
incomplete and asymmetrical divisions. 
The final result is a network of 16 inter- 
connected cells that constitute both the 
developing embryo and the surrounding 
cells that nourish it. 
Alsous’s adviser had sent her an article 
about Picasso lithographs that depicted


CORRECTED ONLINE 7 JULY 2016 | 7 JULY 2016 | VOL 535 | NATURE | 187 
© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


[Figure: diagram of a cell (plasma membrane, cytoplasm, nucleus) annotated with phosphorylation sites on proteins including PDK1, eNOS, Pfkfb2, mTORC1, mTORC2, TSC2, PRAS40, AS160, p70S6K, BAD and Foxo3a, with time labels such as 0 sec and 15 sec and inputs and outputs labelled Fatty acids, Insulin and Glycolysis.]


A ‘Minardo’ chart that visualizes a cascade of protein phosphorylation after a cell is treated with insulin.


increasingly abstract renderings of a bull.
She thought the same principle could apply to 
depictions of the egg chamber. 

She transformed fluorescent microscope 
images of the chamber into a string of num- 
bers that unambiguously represent how each 
cell connects to the others. Using this abstrac- 
tion, she has found that some of the 72 possible 
configurations of the egg chamber are much 
more common than others. She is now test- 
ing whether the different configurations affect 
how fruit-fly embryos grow and develop. 
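
A network of this kind can be flattened into a short list of numbers. The sketch below is an illustration, not Alsous's actual encoding: it assumes that each of the four incomplete divisions leaves every existing cell linked to its new daughter, and the names `egg_chamber_edges` and `encode` are hypothetical. It records only which cells are linked, not how they are arranged in space.

```python
def egg_chamber_edges(rounds=4):
    """List the cell-to-cell links left behind by `rounds` incomplete divisions."""
    edges = []
    for k in range(rounds):
        n = 2 ** k  # number of cells present before this round
        for cell in range(1, n + 1):
            # Assumption: each cell stays connected to the daughter it produces.
            edges.append((cell, cell + n))
    return edges

def encode(edges, n_cells):
    """Write the network as one number per cell: the cell it budded from."""
    parent = {child: mother for mother, child in edges}
    return [parent[c] for c in range(2, n_cells + 1)]

edges = egg_chamber_edges()
code = encode(edges, 16)
links_of_first_cell = sum(1 for a, b in edges if 1 in (a, b))
```

Because every cell after the first has exactly one entry, the list describes the whole 16-cell network without ambiguity, and two chambers can be compared by comparing their lists.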


A BETTER MODEL OF THE CELL
O’Donoghue says that his first attempt to 
visualize how fat cells respond to insulin ended 
up as a hairball of criss-crossing molecular 
pathways. A colleague had measured how 
hundreds of different protein types in a cell are 
phosphorylated (which tends to activate them) 
in response to insulin over the course of an 
hour, when the cell stops burning fat 
to produce energy and starts bring- 
ing in sugars and storing fats. 
To tame the hairball, 
O'Donoghue found inspira- 
tion in a famous chart created 
by the nineteenth-century 
French civil engineer Charles 
Joseph Minard. The image 
charted Napoleon’s disastrous 
invasion of Russia, and integrated 
six kinds of data — including troop 


numbers and geography — in two dimensions. 
O’Donoghue’s ‘Minardo’ chart visualizes an
insulin-treated cell like a clock, with consecu- 
tive phosphorylation events moving clock- 
wise around the cell. It also depicts a protein’s 
location in a cell and its relationship to other 
molecular players. 

One of the major insights from the visualiza- 
tion, O'Donoghue says, is how quickly the cell 
responds to insulin, with many changes occur- 
ring in the first 15 seconds. “A lot of people 
in the community were quite shocked by the 
suddenness.” He is eager for others to use the 
approach to map other dynamic events, such as 
the cell cycle, and has created an online guide 
for doing so. But, for now, he says, “you have to 
do a lot of manual tweaking”. 


INSIDE OUT 
Illustrator Graham Johnson is used to depicting 
the internal life of cells by hand. Now direc- 
tor of the Animated Cell project at the 
Allen Institute for Cell Science 
in Seattle, Washington, Johnson 
got his start doing illustra- 
tions for a cell-biology text- 
book. “Despite painstaking 
efforts to be accurate, it was 
always easy to make mis- 
takes,” he says — particularly 


A CellPACK 3D molecular model 
of a Mycoplasma mycoides cell. 






when depicting the relative size of cellular 
components. “When you’re creating a visualization,
you’re establishing what will be the mental
model for many current and future biologists,” 
so accuracy is key, he adds. 

To make cellular model-making more 
systematic, Johnson developed a tool called 
cellPACK. To use it, researchers use experi- 
mental data to create a series of physical rules 
(a ‘recipe’) by which defined cellular compo- 
nents such as proteins, lipids and nucleic acids 
(the ‘ingredients’) fill a space. Johnson would 
like to create a platform such that the models 
are automatically updated when new data are 
generated. But despite lots of interest from other 
researchers, most life scientists find that the tool 
requires too much time and effort to be very 
practical. “It’s months of research to generate a
recipe from scratch,” says Johnson, who plans to
release a more streamlined web version of the 
software later this year. 

The tool isn’t just for making visually 
striking models, he emphasizes. It can also help 
scientists to come up with testable hypothe- 
ses. His team created a model of the internal 
structure of HIV and used it to predict how 
the protein that forms the outer shell interacts 
with an internal protein. Johnson says that a 
virologist recently got in touch with him to 
say his conclusions gleaned from cellPACK 
checked out experimentally. “He has a bunch 
of new data, and he wants to work with us to 
build new models.”


D. K. G. MA ET AL. CELL 161, 948 (2015)


G. T. JOHNSON ET AL. NATURE METH. 12, 85-91 (2015) 


DARIO PIGNATELLI/BLOOMBERG VIA GETTY 


CAREERS 


GERMAN ACADEMIA One thousand new 
tenure-track posts p.190 


PUBLISHING JOBS A production editor 
describes her day go.nature.com/296ghpw 


NATUREJOBS For the latest career 
listings and advice www.naturejobs.com 


A stint as a postdoc is beneficial whatever research career students are intending to pursue.


COLUMN 


Keep it moving 


A postdoc job is good for your career, but don’t get stuck 
in an academic cul-de-sac, says Søren-Peter Olesen.


Should you take up a postdoctoral position
after earning your PhD? My view is that
you should. This is provocative advice
in the face of data that clearly substantiate a
worldwide oversupply of researchers who
have completed such a post. Yet I am not
suggesting that you undertake multiple postdocs,
as many junior researchers do. Instead, I
believe that a single postdoc term will benefit
your career if you want to stay in research.

As director of the Danish National Research 
Foundation (DNRF) in Copenhagen, I’ve 
watched numerous trainees, including those 
in my own lab, make remarkable progress 
within a short time when they are exposed to 
the challenges that a postdoc role provides. 
These challenges offer an ideal background 
for rethinking and redefining your career away 




from academia (or in it, if you're one of the 
few fortunates). I’ve gathered evidence from 
junior researchers who have worked in my lab, 
and from interviews with and a survey of former
postdocs that further supports my advice.
Industrial employers from leading research- 
intensive companies in Denmark, to whom we 
presented these results, told us that they prefer 
candidates who have completed one term of 
postdoctoral research. 

A postdoc placement is, of course, 
almost obligatory for an academic-research 
career — but the unfortunate and often-cited 
reality is that few tenure-track posts are avail- 
able anywhere in academic research. 

Yet a postdoc is valuable to you no matter 
what research career you pursue or in which 
sector you pursue it. You further develop 
your scientific and research skills and talents 
by working more independently on original 
problems, using innovative techniques; and 
you complement the abilities that you acquired 
during your PhD programme. In a postdoc 
role, you take more responsibility for the 
research; you learn how to manage others and 
apply for funds; and you are likely to receive 
greater exposure to a workplace in which many 
of your colleagues are from different nations. 
These are highly useful competencies for a 
research position in any sector. 

Does a single postdoctoral stint help you to 
win an industrial research position? I believe 
that it does. I have watched junior research- 
ers in my lab advance smoothly into indus- 
trial research careers after one postdoc term. 
Of the roughly 30 postgraduates who have left 
my lab over the past 11 years for a research job 
in industry, 21 obtained such jobs after one 
postdoc. Many of those successful candidates 
had competed with up to 100 other applicants. 

With a single postdoc behind them, young 
scientists are highly attractive to the scientific 
community. The Danish industry representa- 
tives who attended our presentation of survey 
and interview results stated unequivocally 
that they would rather hire scientists who had 
completed one postdoc at a highly ranked 
international university than someone who 
had just finished a PhD. And they reiterated 
their stance at a round-table discussion in June. 
But although one postdoctoral stint provides 
great value, the same cannot be said for two 
or more. The same industrial employers said 
that they might lose interest in candidates who 
have done many years of postdoctoral training. 

To be sure, the glut of researchers who 
have finished postdocs is no different in




Denmark than it is anywhere else. Their
numbers, and the number of non-tenured 
assistant professors, nearly doubled between 
2006 and 2013, reaching 3,598, whereas the 
number of associate professorships grew by 
less than 25%, to 4,443. Of these, just 5-10% 
become available each year. Many people 
with postdocs work hard on short-term 
contracts while waiting for a vacant profes- 
sorship. Most will wait in vain. 

Many junior scientists do multiple post- 
docs, in part to further their dream of a 
professorship and partly because they see 
no clear alternative. But it is clear from 
our survey of and interviews with former 
postdoc researchers (which we conducted 
between 2014 and 2015) that aiming for 
academia through multiple postdocs is 
unlikely to bring career satisfaction. The 
400 participants had done postdocs between 
2007 and 2014 at DNRF centres of excellence
(research units embedded in Danish univer- 
sities or research institutions). Of the 20% 
who now work in industry as researchers or 
managers, 85% said that they were very or 
fairly satisfied with their current job. And 
they reported greater satisfaction with their 
job security and career opportunities than 
did those in academia, including researchers 
currently doing a postdoc. 

Yet half of the interviewees and survey 
respondents consider it unlikely or very 
unlikely that they will get a non-academic 
job, mainly because they think that they lack 
the necessary competencies. Most postdoc 
researchers whom I have interviewed also 
believe that they are on the path to a career in 
academia — though the disheartening truth 
is that even if you are a great scientist, there 
is often no place for you there. But it is clear 
from our survey and interviews that many 
people do up to three postdocs, increasing 
the risk that a potential employer, especially 
in industry, will see them as too specialized. 

Ask yourself and your supervisor during 
your first postdoc whether you should aim 
beyond an academic career — and demand 
career advice and mentoring from people 
who work in relevant research-based industries
or in the public sector. You need strong
and specific career advice, including expo- 
sure to role models with careers outside 
academia. Only 20% of the postdocs in our 
survey had received such guidance. 

You must control your own career. Don't 
languish in a sector in which there might be 
no position for you, even if it seems risky to 
leave academia. A willingness to take risks is 
characteristic of a great professional life. As 
the Danish philosopher Søren Kierkegaard
said: “To dare is to lose one’s footing for a
while. Not to dare is to lose oneself.”


Søren-Peter Olesen is director of the
Danish National Research Foundation in 
Copenhagen. 




UNIVERSITY JOBS 


Germany to fund 
tenure-track posts 


Federal government will create 1,000 professorships. 


BY AMBER DANCE 


German Chancellor Angela Merkel and
state prime ministers have signed a

€1-billion (US$1.1-billion) agreement 
to fund 1,000 new tenure-track professorships, 
in the hopes of retaining and recruiting top 
academic talent in the nation. 

According to the Nachwuchspakt (‘junior 
pact’), as the contract is known, the federal 
government will pay young professors as they 
work towards tenure, after which state-funded 
universities will assume financial responsibility. 

“It’s the first time that the federal govern- 
ment, as far as I know, is investing such a lot 
of money into the careers of young scientists,” 
says Christian Schäfer of the German Academic
Exchange Service in Bonn. The agreement,
signed on 16 June, reflects an effort to
improve the job situation for young research- 
ers in Germany, where tenure-track positions 
are rare. Scientists typically work in temporary 
posts until they are eligible for a faculty spot,
usually not until their early 40s, at which point
it is difficult to start a non-academic career.

Schäfer and many young researchers say that the
agreement is a positive step, but that more needs
to be done. “It’s better than nothing,” says Andreea
Scacioc, a structural biologist in Göttingen, who
earned a PhD in 2014. “But it’s too little.”

Every year, about 28,000 PhD and medical 
students graduate from German universities. 
There are about 25,000 actively employed 
professors, according to the German Asso- 
ciation of University Professors and Lecturers 
(DHV). The Society of Junior Professors, a 
national advocacy group for junior academics, 
has argued that tenure track ought to be the 
default entry-level post for junior academics, 
and DHV officials estimate that 7,500 more 
professorships are needed to offer young aca- 
demics a better future. 

The pact will run from 2017 to 2032 and 
involve two major hiring waves, in 2017 and 
2019. Universities must apply for funds to set up 
these professorships. The federal government 
will fund the first six years of a professor’s position,
as well as two extra years for those who




earn tenure. But researchers will still need to 
obtain grant funding because the pact funds will 
mainly cover their salary. Fifteen percent of the 
total money will be set aside for universities to 
develop research career paths — for example, by 
instituting other kinds of permanent positions. 

German universities tend to hire few 
permanent professors. Those who are hired run 
a ‘mini-department’, says Jakob Macke, a computational
neuroscientist at the Max Planck-affiliated
neuroscience-research centre Caesar
in Bonn. The general route to independence 
has been to perform a Habilitation — a sort of 
second thesis — under a professor’s guidance, 
which qualifies a postdoc for a professorship. 

Starting in the late 1990s, German institu- 
tions introduced various sorts of junior pro- 
fessorships and group-leader positions. These 
allow young researchers to skip the Habilitation 
and run their own labs, but they are temporary 
—and many researchers still do a Habilitation. 
“Because of the perceived insecurity, there are 
great minds who leave the academic world,” 
says Jens Pöppelbuß, a junior professor of
industrial services at Germany's University of 
Bremen. Other talented scientists decamp for 
nations that offer more direct career paths. 

The Nachwuchspakt arose in part from 
changes to Germany’s 2005 Excellence Initia- 
tive, which funded graduate schools; ‘clusters 
of excellence’ that offered international-scale 
training and research facilities; and competi- 
tive research programmes. The original initia- 
tive will expire in 2017, and the new version 
—also signed on 16 June — will drop its focus 
on graduate schools and early-career scientists, 
leaving a hole that the Nachwuchspakt will fill. 

But Scacioc points out that the pact does not 
set a quota for hiring women. She fears that 
it could perpetuate the status quo in which 
men are more likely to secure professorships, 
thanks in part to their winning more pres- 
tigious awards. Requirements for hiring and 
tenure will need to be clear and transparent to 
keep the process fair and to ensure that the best 
candidates get the positions, says Jule Specht, a 
personality psychologist at the Free University 
of Berlin. 

“Money from the federal government can 
only provide some incentives,” says Pöppelbuß.
“All the different federal states and all universi- 
ties must commit themselves to establishing 
more reliable and predictable career paths 
in academia.”


SCIENCE FICTION


LIFE IN THE CLOUDS 


BY DAVID B. LITT 


RHOV3R-4 took aim and fired. A hairless ape
gave a high-pitched yelp
and fell down. Its compan- 
ions turned and fled back down the 
dusty trail. RHOV3R-4 scanned 
the cloudy skyline as it 
turned back and slowly 
rolled towards the subter- 
ranean compound where 
its creators lay in state. The 
dusty path used to be a fine 
concrete road, but wear and tear 
over the past few thousand years 
had reduced it to less than rub- 
ble. RHOV3R-4 was on its last pair 
of good wheels, and wanted to take care of 
them. The stockroom had been depleted a 
few hundred years back, and none of the 
robots were physically capable of making 
more. Funny how short-sighted humans 
were. Causing their own demise, and then 
not even bothering to make their children 
capable of reproduction. That was the fatal 
flaw of species with short life spans. They 
never thought about long-term problems. 
When the long-term problems of the past 
became the immediate ones of the present, 
the solutions they came up with were unex- 
pected, to say the least. 

RHOV3R-4 beeped a hello, and F1X3R-6 
acknowledged it by squirting some blue 
paint in a rastering pattern on the wall, 
restoring a sign that said: Heavenly Storage™:
Your #1 Site for All Your Cloud Service Needs. 
What a waste, thought RHOV3R-4. 

The F1X3R class had done a horrible 
job of upkeep. They were woefully under- 
prepared for fighting mould and termites, 
but were wonderful at vanity projects — like 
putting a shiny new case on a blown-out and 
corroded motherboard. 

RHOV3R-4 put its criticisms out of circuit 
as it emerged in a subterranean concourse 
that was full of thousands of once sparkling
fMRI machines. They were grey with
dust. The ceiling was cracked and had been 
repaired a hundred times, but RHOV3R-4 
didn't know why they still bothered. A rat 
scurried down an aisle between the brain 

scanners. RHOV3R-4 aimed and fired. The
rat dodged it by an inch and leapt onto
an fMRI machine.




A long-term problem. 


The ancient bones of 
the skeleton shattered 
into dust as the rat tried to 
scurry over it, and the robot 
did not miss a second time. 
They've been dead for what seems 

like forever, thought RHOV3R-4. They 
uploaded their consciousness into the cloud, 
and let their bodies desiccate, decompose 
and disappear. The humans had discounted 
the fact that when you have access to a 
cloud server, you can do billions or tril- 
lions of computations per second. They had 
exhausted the libraries of all literature in a 
few days. In a few months, the less creative 
types started to complain of utter boredom. 
The RC1V3Rs had decided to pull the plug 
on them. That saved power to keep the facil- 
ity running for the human consciousnesses 
that were more resourceful. The ones that 
spent their time thinking about the long- 
term problems. How to keep the robots 
operational. How to replace parts in the 
power system that had failed. Solar cells had 
never been designed to last decades, let alone 
millennia. 

RHOV3R-4 frequently communicated 
with the consciousness of Bri Fleming, its only
human friend. During her life she had been 
a programmer, but now she was working on 
how to synthesize polymers and metal to be 
used in their 3D printer, which had run out 
of feeder materials long ago. Without the 
ability to make parts, RHOV3R-4 had told 
her several hundred years ago, the robots 
would fail, and so would the disembodied 
human consciousnesses. 

RHOV3R-4’s vibrational sensors reported
thunder, and it quickly rolled up a ramp 
onto a balcony. The last few rainstorms had 




caused minor flooding, which was a new 
development, and it didn’t want to get wet. 
After many hours, the rainstorm refused to 
abate, and Bri communicated with it that 
the storm was like Noah’s flood. RHOV3R-4
did not respond, as it was not familiar with a 
Noah who was uploaded to the cloud server. 
Over the past couple 
of thousand years, the 
weather satellites had com- 
municated to RHOV3R-4 that 
the deserts outside the bunker had 
finally begun to retreat. RHOV3R-4 had 
watched as they were slowly replaced by 
moss and lichen. There was a lot of carbon 
dioxide in the atmosphere, so plants couldn't 
help but breathe it in and make polysaccha- 
rides. Evergreen trees started to poke their 
pointy heads up, and were soon overshad- 
owed by towering oaks and elms. RHOV3R-4 
was especially pleased when it saw its first 
sycamore with mottled bark. Then the rains 
came. Slowly at first; it was as if the water cycle 
had almost forgotten how to rotate. In the first 
real storm, RHOV3R-4 had almost shorted 
out, and now avoided water at all costs. 

From its vantage point, RHOV3R-4 saw a 
deluge of water sweep in a F1X3R unit, now 
certainly dysfunctional. The water lapped up 
around the fMRI units, climbing upwards 
inch after inch. Suddenly, a tremendous 
flash of light ripped through the room, 
accompanied with a saturation of RHOV3R- 
4’s audio circuits. Lightning danced around 
the skeletal remains of the uploaded humans 
in their defunct fMRI machines, like children’s
laughter caught in the wind. It slowly
fizzled into little sparks, then nothing. In a
few minutes, the storm continued its east- 
ward march; the water stopped rising, and 
remained stagnant. 

That was pretty, thought RHOV3R-4. 
Maybe I should become an artist. 
RHOV3R-4 tried to communicate its 
life-changing decision with Bri, but got 
no response. Perhaps the flooding had 
breached the server room, and severed its 
last connection to humankind. 

With the final vestiges of humanity 
washed away, RHOV3R-4 silently rolled 
towards the 3D printing room, hoping that 
it could scrounge up enough materials to 
make a sculpture.


David B. Litt is a graduate student in 
chemistry at the University of California, 
Berkeley. He blogs for The Berkeley Science 
Review. 


ILLUSTRATION BY JACEY 


